Wednesday, July 18, 2012

Learning Python

Lately, for a few different reasons, I have been working on learning python. A few years ago I spent some time going over the basics and wrote a few scripts, but I never spent much time with it. This time around I have spent more time with it, especially learning the ins and outs of Tkinter, which is the standard GUI library for python. First I wanted to give my general impression of python.

To give an analogy I will compare different programming languages to different types of people. Programming in C is like dealing with a very intelligent, but casual, math professor who is in every way a normal person except that he constantly keeps having you solve some of the weirdest and most esoteric problems anyone can find, and you can't help but wonder how this applies to the real world. Programming in C++ is the same way except the math professor is the chair of the department, and he wrote the book. Of course this means that FORTRAN is that elderly professor in the department that has been teaching since before the current set of associate professors were born.

Programming in Matlab is like having a very intelligent roommate who can fix just about any electronic device, except he can't cook pasta without ruining it. In other words, he is great at his one thing; beyond that, go and find someone else.

Programming in python is like dealing with a Chinese online gamer with ADD.

I think that Chinese is an apt way of describing python. In Chinese there are thousands of "basic" characters that can be modified and combined in unique ways to give new or additional meanings. Likewise, in python it is not uncommon to come across a statement like this: "With this function there are 140 different modes that can be used." (I actually read that in some online documentation.)

Just as with Chinese, in python there are thousands of "basic" commands, or even just hundreds of "standard" libraries, and all of these can be combined and/or modified by many different modes and functions that can produce unique results. There are so many different options that I would estimate that it would take about a year to become proficient in the "basics" of python. The reason why I said that python is like a Chinese gamer with ADD is that even though the documentation for python is very extensive, it always seems incomplete. While learning python I have come across a number of online databases that seem very promising at first, but ultimately end with something like "To be updated later!" (Contrast that with the API for C++ in Windows, which has every single option and command documented three times across three different sites in excruciating detail, and those are just the sites maintained by Microsoft. Granted, C++ has been around longer and the Windows API is used more, but still...) If I don't run into the "to be updated later" problem then it is almost always, "There are 140 different modes for this function, and we will talk about the 7 most common ones, and don't even think about learning about the 133 other potentially useful modes for this function, of which you will most likely really want to use 10 or 12, but we won't talk about all the useful stuff here because nothing can ever be complete in any documentation for python. MWAHAHAHAHAHA!!!"

So if python is like Chinese, then C is like learning English. There are at most 100 useful characters in the English language, but from all those simple characters we can build the full range of literary thought. It just takes more characters to build an equivalent thought. For example, I wrote a simple GUI in C++ that even unfinished was several thousand lines long, spread over 5 different files. I wrote the equivalent GUI in python and it took 100 lines and one file. Even though my python script was shorter and more visually pleasing, programming in C++ felt like I was creating something epic, personal and with infinite variety. Programming in python felt like dealing with an annoying Chinese gamer who couldn't hold a coherent conversation.

In the future I definitely plan on using python for some things, because despite the ADD aspect, it is very useful. It is good at doing the simple front-end stuff that people have to deal with frequently, without needing to get into the blood and guts of it. For everything else I prefer C or C++, and Matlab for all my data processing. I intend to use python to quickly and seamlessly integrate many different C programs and associated output, but not for any other high-end stuff that requires real programming. Using python for anything else would drive me nuts.

Python does have a slightly different flow to it than most other programming languages, and it takes some getting used to, but once the quirks are learned it gets better.


  1. "I have been working on learning python."

    Wahoooooo!   Welcome to the light!

    "Programming in python is like dealing with a Chinese online gamer with ADD."

    Waa...Woo... wait what?!?  It's funny how people view the same things so differently.  I think python is simpler than C, for example, and *personally* have found that most people feel the same way, but it is possible I hang around a biased bunch.

    Anyways, good luck with Python.  I like it a lot but realize others don't.  I will say I have warmed up to FORTRAN (assuming it is FORTRAN 95 or newer) and matlab as well, compared to how I started as a grad student.  FORTRAN 95 has matlab/python-like array gymnastics with slicing, things like the where function, and the elemental declaration that allows functions to act on entire arrays element by element, like matlab/python does, etc...

    C languages don't let you play with such array gymnastics.

  2. As a python neophyte, I have to agree on the ADD Chinese gamer analogy.  When I work in python, I am always frustrated because it seems as if there is already a function to do what I want done, but I can't quite get it to work and I never know why.  I think this gets back to the lack of documentation issue, for the most part.

    That being said, it is far superior to IDL and far cheaper than Matlab, so if I ever were to teach a numerical techniques class similar to the computational labs at BYU, I would do it in python.

  3.  Our code is Fortran 95 and while I appreciate the features you mentioned, the drawback to Fortran as opposed to C/C++ is that there are virtually no high performance computing tools for Fortran while there are quite a few for C/C++.  Things like parallel debuggers and code-profiling tools simply are not available for Fortran 95, which makes code development a lot more difficult than with C/C++.  Oddly, however, many of these tools do exist for Fortran 77.

  4. I think I would go crazy trying to teach numerical techniques using python. Part of my reasoning is that python is intended for a different crowd and thus most of the online help would not be helpful. Matlab on the other hand is built for it, and in many cases has internal help files for the techniques I would be teaching. Thus it is easier for the students to find help understanding how things work.

    Also there are subtle things about Matlab that make it easier for students to learn numerical techniques, such as the required "end" at the end of for loops and if statements. This may not seem like much but for beginners it is a major help in organizing the code. Also the Matlab IDE is unparalleled anywhere else in all of programming. Microsoft Visual Studio comes second (yes that's right, a Microsoft product really is one of the best there is).

    But having made the case for Matlab, I can see how python would be great in some ways for learning numerical techniques. Things that it took me 2 or 3 weeks to figure out in C++ I figured out in less than a day in python.

    ....should be taken down some alley and put out of its misery.

  5. There are four issues I have with matlab where I feel Python is superior beyond coding semantics.

    1.  The first is the obvious licensing fee, which I believe is not a trivial problem.  For example, I am about to be a postdoc where this isn't a problem.  But I know a postdoc with a fellowship with an $8000 research budget moving to a university where now he has to pay for the non-student license.  Since he has all this elaborate Matlab code he needs to use, he needs a lot of packages, running the bill to like $1000.  Something like 12% of his entire research budget is going to go to Matlab!   That is non-trivial given he still has to buy a computer and pay for conferences and his own papers now that he has his own budget, and the rest doesn't go very far.

    2.  Type python astronomy library into Google and see how many links pop up with a lot of stuff.  Type in cosmology library etc... There are a lot of python libraries.  Now type Matlab astronomy library and you will get one old, unmaintained, largely useless library and nothing else.  I think this is true for a lot of physics fields.  There are not a lot of community-maintained physics libraries for Matlab.  You have this nice GUI but have to write things from scratch that Python users get for "free".

    3.  Make a plot in Matlab and save as a png or jpeg.  Now save as an eps.  Only the eps looks professional.  I send just as many plots to colleagues in png and jpeg so they can view them in their web browser as I make eps for papers.   So I am forced, if I want the plot to be professional, to save in eps then convert to png/jpeg by making a screenshot if I use Matlab.  Python just saves the png and jpeg just as nice as the eps.  

    Now this is just a pet peeve, but still.  Why can't the paid Matlab devs figure out how to make png plots look as nice as the eps ones?  The "work-for-free" Python devs can.

    4.  So often I have an expression in Matlab like a.*b where a and b are arrays and only want like the first x elements.  In python I can type (a*b)[0:10] but in matlab I have to do c = a.*b; c(1:10).  I.e., in Python I can index an entire expression no matter how complex, like (expression)[slice], whereas in Matlab I have to write multiple lines since I can't index expressions, only arrays.
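
A minimal sketch of the indexing point in item 4. The comment's `(a*b)[0:10]` presumably assumes NumPy arrays; that is an assumption here, and to keep the sketch runnable with only the standard library it uses plain lists and a comprehension for the elementwise product. The names `a` and `b` come from the comment; their values are made up for illustration.

```python
# Two plain-Python sequences standing in for the arrays a and b
# (hypothetical values, chosen only for illustration).
a = list(range(1, 21))   # 1, 2, ..., 20
b = [2] * 20             # twenty 2s

# Index the whole expression directly: no temporary name is needed.
first_ten = [x * y for x, y in zip(a, b)][0:10]

# The two-step form the comment describes for Matlab,
# written in Python for comparison.
c = [x * y for x, y in zip(a, b)]
first_ten_two_step = c[0:10]

print(first_ten)
print(first_ten == first_ten_two_step)
```

Both forms give the same ten values; the first simply skips the intermediate name `c`.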

    So with all those pet peeves let me give the one feature of Matlab that puts Python and all other languages to shame: code cells!!!

    Once you go code cell you never go back!

  6. Debuggers?  Print statements, my friend, are all you need. :)  Just kidding.  (And FORTRAN has the stop statement, which is nice while debugging so that you can get a program to stop at an exact place. But I guess that is just old school.)
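
For comparison, a rough Python sketch of the same old-school approach: print statements for tracing, plus `sys.exit` as a loose analogue of the FORTRAN stop statement. The function names `running_mean` and `stop_here` are hypothetical and exist only for this illustration.

```python
import sys


def running_mean(values):
    """Return the running mean of values, tracing each step on stderr."""
    total = 0.0
    means = []
    for count, value in enumerate(values, start=1):
        total += value
        # Print-statement debugging: show the state at every iteration.
        print(f"debug: count={count} value={value} total={total}",
              file=sys.stderr)
        means.append(total / count)
    return means


def stop_here(label):
    """Halt the program at this exact point by raising SystemExit."""
    sys.exit(f"stopped at {label}")
```

Calling `stop_here("after running_mean")` right after the line under suspicion ends the run there, so nothing later in the script executes. Python also ships a built-in `breakpoint()` function for pausing in the debugger instead of exiting.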

  7. PS.  Despite my Matlab rant I prefer matlab for a lot of things.  I have the "old married couple" relationship with Matlab in that I love to use it on one hand and on the other there are just these few annoying things I can't let go...

    My biggest issues with Python are the lack of code cells(!), the lack of a nice GUI, and the fact that though there are a lot of libraries, they are scatterbrained enough (since it is all 3rd party stuff) that it takes some work figuring out how to use them.  

  8. PPS.  I don't have an old married couple relationship with IDL... I have an "I want an immediate divorce but no matter how hard I try I can't" style relationship.

  9.  I agree 100%.  IDL was actually developed by a guy that worked for my adviser in the 1980s and CU has a sweetheart site license (free for university-owned machines, $5 per student, staff, or faculty machine), so my research group will never stop using it.  Add to that the 10+ person-years of analysis code we have developed and you can see why I'm in an abusive relationship with IDL and I just can't seem to leave.

    I tell myself "IDL only requires 15 lines of code to save a plot as PostScript because deep down he really loves me."

