My Python Deliberate Practice

First of all, don’t be afraid: read Plateau of Productivity. More importantly, be patient. A good read on this is Peter Norvig’s Teach Yourself Programming in 10 years.

“Researchers have shown it takes about ten years to develop expertise in any of a wide variety of areas, including chess playing, music composition, telegraph operation, painting, piano playing, swimming, tennis, and research in neuropsychology and topology. The key is deliberative practice: not just doing it again and again, but challenging yourself with a task that is just beyond your current ability, trying it, analyzing your performance while and after doing it, and correcting any mistakes. Then repeat. And repeat again. There appear to be no real shortcuts” - Teach Yourself Programming in 10 years

Motivation

I used Matlab extensively during my PhD program for simulation, data analysis, and visualization.

By now, I have a pretty good working knowledge of Matlab. There is obviously much more that I could learn, in particular building and maintaining Matlab modules as well as more advanced Matlab material. However, Python has always appealed to me for a few reasons as I focus on machine learning and data science:

  • It’s a general-purpose programming language, so presumably it is a lot easier to learn good software engineering principles. (What are they though?)
  • Many data stacks are built with tools from the Python ecosystem (ETL using Airflow, front-end using Flask with RESTful API support, machine learning using scikit-learn). Being able to use the same language for different parts of the data stack brings prototypes closer to production.

To me, the appeal of Python is not only the data analysis part. The appeal of using Python for data work is that you have a better chance of seeing how data plays a role within the whole integrated technology stack. Knowing Python is likely to make me a better end-to-end data scientist and a better machine learning engineer.

Here is a great Reddit answer that beautifully explains the intersection and disjoint union of the two languages.

Deliberate Practice

I am a huge believer in learning by doing, and there are a lot of opportunities on the job where I can hone my Python skills through Deliberate Practice:

  • Identify the Top Performers: There are quite a few people (e.g. Robert C.) who can be role models for me. I want to understand what they went through to get where they are today, and what mental representation of Python they have that I lack.

  • Build Practice Plans: Ideally, based on the rough understanding of that mental representation:

    • Define clear goals and select learning materials
    • Create deadlines and milestones for the project
    • Estimate the time required and come up with weekly schedules

    Augment these insights with your current level of mental representation of Python to improve your understanding.

  • Targeted Practice: If I force myself to switch over to Python for data analysis, data visualization, and modeling, or to contribute to open source Python data analysis packages, I maximize the time I spend practicing this skill, which is high leverage.

  • Immediate Feedback: The importance of feedback cannot be overemphasized. I have built a habit of constantly sending my code to friends and online connections for review, critique, and feedback. Look for every opportunity to get feedback.

Performance Goals

  • [Immediate] Learn to write Pythonic code
  • [Shorter term, easiest to practice] Write reusable, modular, tested code for my data work and knowledge posts
  • [Medium term, harder to practice] Achieve efficiency and feature parity on data analysis using Python compared to R
  • [Longer term, hardest to practice] Write tools. Be able to work on projects that span the entire data stack using Python, and apply good software engineering principles to these projects
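
A minimal sketch of what the first two goals might look like in practice, using only the standard library. The function name `column_means` and the sample rows are hypothetical, made up for illustration rather than taken from any real project:

```python
from statistics import mean


def column_means(rows):
    """Return the mean of each column in a list of dict rows.

    `rows` is a hypothetical stand-in for a small table, e.g.
    [{"a": 1, "b": 10}, {"a": 3, "b": 20}]. Every row is assumed
    to have the same keys as the first row.
    """
    if not rows:
        return {}
    # A dict comprehension over the keys replaces index-based loops
    # and manual accumulator variables.
    return {key: mean(row[key] for row in rows) for key in rows[0]}


def test_column_means():
    # A small test that documents the expected behaviour.
    rows = [{"a": 1, "b": 10}, {"a": 3, "b": 20}]
    assert column_means(rows) == {"a": 2, "b": 15}
    assert column_means([]) == {}
```

Calling `test_column_means()` directly, or letting a test runner such as pytest collect it, checks the function. Keeping the logic in a small, documented function is what makes it reusable across analyses.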

Project Goals

  • Outcome: I want to move my data stack to Python completely. This means doing my day-to-day data analysis in Python instead of R and making my code as Pythonic as possible. I also want to become a contributor to Airpy / tools and take on one bigger Python project (ML, data viz, etc.).

  • Curriculum: I want to do everything I can to go through all the basic materials on the Pandas/Matplotlib combo, expose myself to functional programming, OOP, testing in Python, or even making command-line tools, and get feedback from experts.

  • Timeframe: Efficiency parity by the end of December 2017. One ongoing big project touching different stacks in Python by the end of 2017.

Project Milestones

Next Steps / Level In 2018

Once I have mastered all of the above, the next natural step is to create public work that other people can use, so that useful tools become available to everyone. A great introduction to getting started is Tim Hopper’s talk, Sharing Your Side Projects.
