My Python Deliberate Practice
First of all, don’t be afraid: read Plateau of Productivity. More importantly, be patient. Peter Norvig’s essay, Teach Yourself Programming in Ten Years, is a good read on why.
“Researchers have shown it takes about ten years to develop expertise in any of a wide variety of areas, including chess playing, music composition, telegraph operation, painting, piano playing, swimming, tennis, and research in neuropsychology and topology. The key is deliberative practice: not just doing it again and again, but challenging yourself with a task that is just beyond your current ability, trying it, analyzing your performance while and after doing it, and correcting any mistakes. Then repeat. And repeat again. There appear to be no real shortcuts” - Peter Norvig, Teach Yourself Programming in Ten Years
Motivation
I used Matlab extensively during my PhD program for simulation, data analysis, and visualization.
By now, I have a pretty good working knowledge of Matlab. There are obviously many more things I could learn, in particular building and maintaining Matlab modules as well as more advanced Matlab material. However, the appeal of Python has always been there for me, for a few reasons, as I focus on machine learning and data science:
- It’s a general purpose programming language, so presumably it is a lot easier to learn good software engineering principles. (What are they though?)
- Many of the data stacks are built using the tools in the Python ecosystem (ETL using Airflow, front-end using Flask with RESTful API support, machine learning using scikit-learn). Being able to use the same language for different parts of the data stack brings prototypes closer to production.
To me, the appeal of Python is not only the data analysis part: using Python for data work gives you a better chance to see how data plays a role within the whole integrated technology stack. Knowing Python is likely to make me a better end-to-end data scientist and a better machine learning engineer.
Here is a great Reddit answer that explains the intersection and disjoint union of the two languages beautifully.
Deliberate Practice
I am a huge believer in learning by doing, and there are a lot of opportunities on the job where I can hone my Python skills through Deliberate Practice:
- Identify the Top Performers: I think there are quite a few people (e.g. Robert C.) who can be role models for me. Understand what they’ve been through to get to where they are today. What mental representation of Python do they have that I lack?
- Build Practice Plans: Ideally, based on a rough understanding of that mental representation:
    - Define clear goals and select learning materials
    - Create deadlines and milestones for the project
    - Estimate the time required and come up with weekly schedules

    Augment these insights with your current mental representation of Python to improve your understanding.
- Targeted Practice: If I force myself to switch over to Python for data analysis, data visualization, and modeling, or to contribute to open source Python data analysis packages, I can maximize the time I spend practicing this skill, which is high leverage.
- Immediate Feedback: The importance of feedback cannot be overemphasized. I have built a habit of constantly sending my code to friends and online connections for review, critique, and feedback. Seek out feedback at every opportunity.
Performance Goals
- [Immediate] Learn to write pythonic code
- [Shorter term, easiest to practice] Write re-usable, modular, tested code for my data work and knowledge posts
- [Medium term, harder to practice] Achieve efficiency and feature parity on Data Analysis using Python compared to R
- [Longer term, hardest to practice] Write tools. Work on projects that span the entire data stack using Python, and apply good software engineering principles to them
Project Goals
- Outcome: I want to move my data stack to Python completely. This means doing my day-to-day data analysis work in Python instead of R and making my code as pythonic as possible. Become a contributor to Airpy / tools, and take on one bigger Python project (ML, data viz, etc.).
- Curriculum: I want to do everything I can to go through all the basic materials on the Pandas/Matplotlib combo. Expose myself to functional programming, OOP, testing in Python, and even building command-line tools. Get feedback from experts.
- Timeframe: Efficiency parity by end of December, 2017. One ongoing big project touching different stacks in Python by the end of 2017.
Project Milestones
- Learning Python & Best Practices
- Writing Pythonic Code
    - Guidelines For Writing Pythonic Code
        - Functions: use *args and **kwargs to accept arbitrary arguments in a function definition
        - Tuples: effective unpacking, use _ as a placeholder, swap values without a temporary variable
        - List/Dict/Set: list, dict, and set comprehensions; dict.get
        - Strings: use .format, use .join
        - Classes: use a leading _ (or a leading __ for name mangling) in method and attribute names to mark them private
        - Generators: use a generator to lazily produce an infinite sequence
        - Modules: write modules for encapsulation
        - Formatting: PEP 8 standards
        - Executable scripts: if __name__ == "__main__"
        - Imports: the right way to do imports
    - Writing Idiomatic Python - Jeff Knupp
    - Stanford CS 41: Idiomatic Python
    - Another Tutorial On How To Write Pythonic Code
- IPython Notebook
- Pandas For Data Analysis
- Data Visualization
    - BIDS: Python Bootcamp: Intro to Matplotlib: the 800 pound gorilla; everything is customizable, but very low level
    - Seaborn: good for statistical visualization. I still find it a bit limited in the types of simple plots it can do
    - Bokeh: interactive, web-browser-based data visualization
    - A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair)
- Writing Object Oriented Programming Python Code
    - Computational Biology: OOP For Scientist
    - Improve Your Python: Jeff Knupp: OOP
    - BIDS: Python Bootcamp: OOP
    - Simeon Franklin’s Twitter University Class (not available to the public)
- Writing Functional Programming Python Code
- Machine Learning In Python
- Testing In Python
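To make the pythonic-code guidelines in the milestones above more concrete, here is a minimal sketch of several of them in one small script. Every name in it (describe, swap, count_words, naturals, Account) is hypothetical and chosen only for illustration; none comes from the linked tutorials.

```python
import itertools


def describe(*args, **kwargs):
    """Accept arbitrary positional and keyword arguments."""
    return "{} positional, keywords: {}".format(
        len(args), ", ".join(sorted(kwargs))
    )


def swap(pair):
    """Unpack a tuple and swap its values without a temporary variable."""
    left, right = pair
    left, right = right, left
    return left, right


def count_words(words):
    """Count occurrences with dict.get instead of checking for the key first."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


def naturals():
    """Generator: lazily yield 0, 1, 2, ... without ever building a list."""
    n = 0
    while True:
        yield n
        n += 1


class Account:
    """Hypothetical class showing the underscore naming conventions."""

    def __init__(self, owner, balance):
        self.owner = owner
        self._balance = balance  # single underscore: private by convention only
        self.__pin = "0000"      # double underscore: name-mangled to _Account__pin

    def balance(self):
        return self._balance


def main():
    print(describe(1, 2, sep="-", end=""))
    print(swap(("a", "b")))

    _, extension = ("report", "csv")  # _ marks a value we do not need
    print(extension)

    words = ["spam", "eggs", "spam"]
    print(count_words(words))
    print([w.upper() for w in words])        # list comprehension
    print({w: len(w) for w in words})        # dict comprehension
    print(sorted({len(w) for w in words}))   # set comprehension

    # islice takes a finite slice of the infinite generator.
    print(list(itertools.islice(naturals(), 5)))
    print(Account("me", 100).balance())


if __name__ == "__main__":
    # Runs only when the file is executed as a script, not when it is imported.
    main()
```

Keeping the demonstration inside main() behind the __name__ check means the same file can be imported as a module (for example, by a test) without triggering the prints.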
Next Steps / Level In 2018
Once all of the above is mastered, the natural next step is to create public work that other people can use, so that useful tools are available to everyone. A great introduction to getting started is Tim Hopper’s talk, Sharing Your Side Projects.
- Logging In Python (Next Year?)
- Writing Command-Line Tool (Next Year?)
- Building Packages In Python (Next Year?)
Reference
- Python Tutor Visualizer
- Python For Data Analysis
- Stanford CS 41: Python
- Berkeley CS 88: Python Data Structure
- Harvard CS 109: Data Science
- Berkeley BIDS Python bootcamp
- Josh Bloom’s Python Computing For Data Science
- Writing Idiomatic Python - Jeff Knupp
- Another Tutorial On How To Write Pythonic Code
- Pandas Cookbook
- Udemy course
 
      
    