Galvanize Week 0: Pre-course Primer

Before the program started, Galvanize gave us a set of pre-course assignments to get everyone equally prepared for Day 1. It came with a warning that the work would take anywhere from 20 to 40 hours. It consisted of 8 chapters:

  1. Python
  2. Linear Algebra
  3. SQL
  4. Pandas
  5. Probability
  6. Statistics
  7. Hypothesis Testing
  8. Web Awareness (making a webpage with HTML & CSS)

Since most of my coding experience was in MATLAB (6 yr) and R (1 yr), the Python chapter took me the longest, but I also enjoyed it the most. I spent the most time on an assignment focused on writing one-liner functions for tasks like performing a "perfect shuffle" of a list (halving a list, then recombining by alternating entries from the two halves). Another assignment in this chapter walked us through writing functions for what I think predictive text does: given some text file input, creating a dictionary of consecutive words ({(first word, second word): [third word, ...], ...}) and a counter dictionary of consecutive words ({first word: {second word: 1, ...}, ...}), then chaining words together by looking up each next word in these dictionaries. Pretty cool!
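Here is a rough sketch of how those two assignments could look. This is not the course's solution: the function names (perfect_shuffle, build_word_dicts, generate) are made up for illustration, the shuffle assumes an even-length list, and the text is split on whitespace only.

```python
import random
from collections import defaultdict


def perfect_shuffle(items):
    """Halve an even-length list, then alternate entries from the two halves."""
    return [x for pair in zip(items[:len(items) // 2], items[len(items) // 2:]) for x in pair]


def build_word_dicts(text):
    """Build the two dictionaries described above from a string of text."""
    words = text.split()
    followers = defaultdict(list)                    # {(first, second): [third, ...]}
    counts = defaultdict(lambda: defaultdict(int))   # {first: {second: count}}
    for first, second in zip(words, words[1:]):
        counts[first][second] += 1
    for first, second, third in zip(words, words[1:], words[2:]):
        followers[(first, second)].append(third)
    # Convert to plain dicts so missing keys raise KeyError instead of being created.
    return dict(followers), {k: dict(v) for k, v in counts.items()}


def generate(followers, first, second, max_words=10, rng=None):
    """Chain words by repeatedly looking up the last two words in followers."""
    rng = rng or random.Random(0)  # fixed seed so repeated runs give the same text
    out = [first, second]
    while len(out) < max_words and (out[-2], out[-1]) in followers:
        out.append(rng.choice(followers[(out[-2], out[-1])]))
    return " ".join(out)
```

For example, perfect_shuffle([1, 2, 3, 4, 5, 6]) returns [1, 4, 2, 5, 3, 6], and for the text "the cat sat on the cat mat" the followers dictionary maps ("the", "cat") to ["sat", "mat"].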

We were also given the opportunity to attend an optional Python primer week (Week 0) last week. I was only able to make it to the last two days, but I still learned a lot! I attended Day 3, which covered object-oriented programming (constructing classes), and Day 4, which was a review of Days 1-3 (Python basics, dictionaries & sets, and object-oriented programming).

The programming assignments they gave us involved building a sparse matrix class (should come in pretty handy later on!) and nesting object classes to process and manipulate weather data.
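The assignment's actual specification isn't reproduced here, but one common way to build a sparse matrix is a "dictionary of keys" that stores only the nonzero entries. A minimal sketch under that assumption (the class layout and the nnz method are my own choices, not the course's):

```python
class SparseMatrix:
    """Dictionary-of-keys sparse matrix: only nonzero entries are stored."""

    def __init__(self, n_rows, n_cols):
        self.shape = (n_rows, n_cols)
        self._data = {}  # {(row, col): value}

    def _check(self, key):
        """Raise IndexError if (row, col) falls outside the matrix."""
        row, col = key
        if not (0 <= row < self.shape[0] and 0 <= col < self.shape[1]):
            raise IndexError("index out of range: %r" % (key,))

    def __getitem__(self, key):
        """m[row, col] returns the stored value, or 0 if nothing is stored."""
        self._check(key)
        return self._data.get(key, 0)

    def __setitem__(self, key, value):
        """m[row, col] = value stores nonzero values and forgets zeros."""
        self._check(key)
        if value == 0:
            self._data.pop(key, None)
        else:
            self._data[key] = value

    def nnz(self):
        """Number of nonzero entries currently stored."""
        return len(self._data)
```

With this, a 1000 x 1000 matrix holding a single nonzero value keeps one dictionary entry instead of a million numbers.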

The biggest take-homes for me over those days were:

Mutable vs. immutable types
(e.g., lists vs. ints):

a = [1, 2, 3]  
b = a  
a[0] = 100  
print(b)  

gives the output [100, 2, 3] because lists are mutable: a and b point to the same object, which changes "in place" rather than being replaced by a new object.

However, for an immutable type, like integers:

a = 1  
b = a  
a = 100  
print(b)  

gives the output 1. After b = a, both names refer to the same integer object. Since integers are immutable, a = 100 cannot modify that object; it rebinds a to a new object of value 100, while b still refers to the original 1.

So, it's definitely important to know which data types are mutable/immutable. I'm pretty sure I would have tripped up somewhere not knowing that lists are mutable types.
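To make the difference concrete, here is a small sketch that checks object identity with the is operator and shows one way to avoid the aliasing surprise by copying the list. The variable names are just for illustration.

```python
a = [1, 2, 3]
b = a          # b is another name for the same list object
c = list(a)    # c is a new list with the same contents (a shallow copy)

a[0] = 100     # mutates the shared list in place

print(b)       # [100, 2, 3] -- b sees the change
print(c)       # [1, 2, 3]   -- the copy does not
print(a is b)  # True
print(a is c)  # False

x = 1
y = x          # x and y refer to the same int object
x = 100        # rebinds x to a different object; y is untouched
print(y)       # 1
```

Copying with list(a) (or a[:]) is enough for a flat list; nested lists would still share their inner lists.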

Magic methods
Magic methods (denoted with two leading and two trailing underscores) are "behind-the-scenes" methods that define how objects behave. Say you create some object type (e.g., Car). You would need a way of giving it some implicit characteristics or functionality.

The __init__ magic method defines what inputs you need to create an instance of that object and sets its initial attributes (color, make, model, year). The object's regular methods (start engine, stop engine, honk) are defined alongside __init__ in the class body.

The more interesting part to me is that magic methods define what happens when you call car1 + car2, car1 < car2, len(car1), and much more. You have the freedom to customize these however you like. For example:

  • __add__ to define car1 + car2 as adding mileage
  • __cmp__ (Python 2 only; Python 3 uses __lt__ and the other rich comparison methods) to define car1 < car2 as comparing which car is older
  • __len__ to define len(car1) as length of car make
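Here is a minimal sketch of a Car class along those lines. A few assumptions: it uses __lt__ rather than __cmp__ so that it also runs on Python 3, "older" is taken to mean an earlier model year, car1 + car2 returns the combined mileage as a number, and the mileage attribute and the honk method body are made up for illustration.

```python
class Car:
    def __init__(self, make, model, year, mileage=0):
        # Inputs needed to create a Car; they become its attributes.
        self.make = make
        self.model = model
        self.year = year
        self.mileage = mileage

    def honk(self):
        """A regular method, defined in the class body next to __init__."""
        return "Beep!"

    def __add__(self, other):
        """car1 + car2 gives the combined mileage of the two cars."""
        return self.mileage + other.mileage

    def __lt__(self, other):
        """car1 < car2 is True when car1 is older (earlier model year)."""
        return self.year < other.year

    def __len__(self):
        """len(car1) gives the length of the car's make."""
        return len(self.make)


car1 = Car("Honda", "Civic", 2008, mileage=90000)
car2 = Car("Toyota", "Corolla", 2015, mileage=30000)

print(car1 + car2)  # 120000
print(car1 < car2)  # True
print(len(car1))    # 5
```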

Private methods
Private methods (denoted with a single leading underscore) are a good way of organizing code by breaking it down into sub-functions. The single leading underscore (_private_method_name(args)) is a stylistic indicator that a function is private, meaning that it is intended only for internal processing within its own class or file and shouldn't be called from outside it. Python doesn't enforce this; it's a convention.

Usually, I would write a function as a series of code blocks, each beginning with a comment line describing what the block does. With private methods, each block is replaced by a single call to a private method, so the function looks more structured and organized (plus, you can reuse code more easily if needed). I think getting into the habit of using private methods would help me practice mentally breaking down coding problems into better organized and more digestible sections.

Overall, my first two days went well. It's a good environment for learning: it's comfortable for asking questions, people ask some challenging questions, and there is generally good dialogue. The two Data Scientists in Residence (DSRs) effectively act as teaching assistants and do a good job of involving us and floating around the room to check how we're doing.

I'm looking forward to our official start next week! Our instructor told us to use our long weekend to have fun and rest up for the couple of months ahead :)
