Difference between revisions of "LaTeX source code for my BIOF309 syllabus"


Latest revision as of 14:34, 27 March 2014


\documentclass{article}
\usepackage{hyperref}       % Any \usepackage commands must come after the \documentclass statement.


                           % The preamble begins here.
\title{BIOF 309 - Introduction to Python}  % Declares the document's title.
\date{Spring 2014}   % Deleting this command produces today's date.
\author{Chris Coletta (lead instructor) - \href{mailto:christopher.coletta@gmail.com}{\texttt{christopher.coletta@gmail.com}} \\ 
              Matt Shirley - \href{mailto:mdshw5@gmail.com}{\texttt{mdshw5@gmail.com}} \\
              Anatoly Dryga - \href{mailto:anatoly.dryga@nih.gov}{\texttt{anatoly.dryga@nih.gov}} \\
              Ben Busby - \href{mailto:ben.busby@gmail.com}{\texttt{ben.busby@gmail.com}} }    % Declares the author's name.


\begin{document}           % End of preamble and beginning of text.

\maketitle                 % Produces the title.


\begin{itemize}
   \item  Class forum: \href{https://groups.google.com/d/forum/faes-biof309-spring2014}{\texttt{https://groups.google.com/d/forum/faes-biof309-spring2014}}
   \item  Group email address: \href{mailto:faes-biof309-spring2014@googlegroups.com}{\texttt{faes-biof309-spring2014@googlegroups.com}}
\end{itemize}
\centerline{\textit{This document last revised on Monday, January 27, 2014}}
\section{Course Description}
This course is intended to teach research professionals without a background in programming to write programs to gain insight into data. In addition to covering tools and syntax that are specific to Python, the class will cover elementary concepts that are ubiquitous in modern software engineering, including object-oriented programming, regular expressions, reading from and writing to text files, recursion, use of the debugger, etc. The end of the course will focus on potential applications of the Python language to bioinformatics, including sequence analysis, machine learning, and data visualization.
\section{Logistics}
The class will meet sixteen times during the semester at 5:30pm every Thursday from January 30, 2014 until May 15, 2014. Our classroom will be in Building 10 on the main NIH campus in Bethesda, room B1C211. Many lectures will be recorded ahead of time as screencasts, which students can watch over the Internet at their leisure. It is hoped that this will free up the meeting time to serve as a discussion section or a time to do class exercises, as well as allow students to learn at their own pace. It is the responsibility of the student to watch the lecture before coming to class, in addition to submitting homework assignments. \emph{Students are encouraged to use the Google Groups email list (link above) to ask for help or pose a question, rather than emailing the instructor directly.}
\section{Required Materials}
Each student is required to have a computer running Python 2.7. Instruction will be given in Python 2.7, and syntactic differences between 2.x and 3.x will be highlighted. When installing Python, students are encouraged to use the free \href{https://store.continuum.io/cshop/anaconda}{Anaconda Scientific Python Distribution}, which bundles core Python with many essential and useful third-party modules. Choice of operating system is left to the student's discretion.
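
As a small illustration of the kind of 2.x versus 3.x difference mentioned above, the following sketch (an example added for illustration, not taken from the course materials) uses the standard \texttt{\_\_future\_\_} imports so that \texttt{print} and division behave the same way under Python 2.7 and 3.x:

\begin{verbatim}
# Illustrative sketch only; not part of the official course materials.
# Without this import, Python 2.7 treats print as a statement and
# 7 / 2 as integer division.
from __future__ import print_function, division

print(7 / 2)    # true division: 3.5 under 2.7 (with the import) and 3.x
print(7 // 2)   # floor division: 3 under both
\end{verbatim}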
\subsection{Optional Textbooks}
There is no required textbook for this course, but students may find these textbooks useful:
\begin{itemize}
   \item Downey, Allen B. Think Python. \href{http://www.greenteapress.com/thinkpython/html/index.html}{Available free online}
   \item Model, Mitchell L. Bioinformatics Programming Using Python. 2009, O'Reilly Media.
   \item McKinney, Wes. Python for Data Analysis. 1st ed. 2013, O'Reilly Media.
   \item Jones, Martin. Python for Biologists. 1st ed. 2013, CreateSpace Independent Publishing Platform.
\end{itemize}
\section{Grading} 
\subsection{Homework}
Homework will count for 80\% of the final grade. Homework should be submitted as an email attachment to the instructor who assigned it, before the beginning of class on the day it is due. \emph{N.B.: ANY FILES SUBMITTED MUST ADHERE TO THE FOLLOWING NAMING CONVENTION: lastname\_firstinitial\_hw\#.py, i.e., no spaces, name appears first and assignment number appears second in the filename.} Homework must run under a Python 2.7 interpreter; programs written in Python 3.x are not acceptable. Grading will be based on the following rubric:\\ \\
Program runs, contains useful comments, meaningful variable names, follows ``Pythonic'' coding conventions: A \\
Program runs, produces correct result: B \\
Program runs, produces something close to the correct result: C \\
Program runs, does not produce correct result: D \\
Program does not run: F
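
The naming convention above can be checked mechanically before submitting. The sketch below is only an illustration: the helper \texttt{is\_valid\_name} is hypothetical, and it assumes lowercase ASCII names and a numeric assignment number, details the convention itself does not spell out.

\begin{verbatim}
import re

# Hypothetical helper; not part of the official submission process.
# Assumed pattern: lastname_firstinitial_hw<number>.py, lowercase, no spaces.
NAME_PATTERN = re.compile(r'^[a-z]+_[a-z]_hw[0-9]+\.py$')

def is_valid_name(filename):
    """Return True if filename follows the assumed naming pattern."""
    return NAME_PATTERN.match(filename) is not None

print(is_valid_name('coletta_c_hw3.py'))   # True
print(is_valid_name('hw3 coletta.py'))     # False
\end{verbatim}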
\subsection{Final Project}
The final project will count for 20\% of the final grade. Each student will write a program to solve a problem relevant to their work, studies, or interests. The student will have wide latitude in establishing the parameters of their project, with the expectation that the program will do one or more of the following: organize, manipulate, analyze, visualize, or interpret some data set, or perform a calculation. Each student will be responsible for determining and writing up the requirements for the program (what will it do? who will use it?), and for making a five-minute presentation about their project to the class at the end of the semester.
\section{Schedule}  
\subsection*{Week 1 - January 30, 2014 - Introduction and Installation}
Housekeeping issues; How to get help; Discussion of programming language ecosystem and where Python fits in; What makes Python distinctive; The Python data analysis stack: core Python + essential 3rd party modules; Setup of Environment; Using Python interactively via IPython notebook; Running a program. Homework: Email me the magic number.
\subsection*{Week 2 - February 6, 2014 - Python Primitives}
Exceptions; Named Values (Variables); Core Python types; Conversion between types; Math expressions; Matrix operations using NumPy; Strings, with escape characters. \textbf{\textit{Last day to withdraw from class with full refund is Friday, February 7.}}
\subsection*{Week 3 - February 13, 2014 - Logic, Lists, and Loops}
Boolean expressions \& operators; conditional flow statements; iterable types; strings as iterables; loops. 
\subsection*{Week 4 - February 20, 2014 - Functions \& Debugging}
Namespaces; Global vs. local scope; Functions: definitions, arguments, return statement, decorators;  Coding Style; Debugging: how to invoke, commands.  \textbf{\textit{Last day to withdraw from class with 60\% refund is Friday, February 21.}}
\subsection*{Week 5 - February 27, 2014 - Slice, Dice, Combine \& Sort}
File I/O; Parsing strings; List comprehensions; Sorting using a lambda function; String interpolation/formatting; Case study: Gene list with Z-scores.
\subsection*{Week 6 - March 6, 2014 - Pattern Matching (Regular Expressions)}
Why they are needed; Example Usage; Python RegExp workflow; Metacharacters; Character Classes; Quantification; Grouping; Back-referencing; Lazy vs. Greedy. \textbf{\textit{Drop dead day to withdraw from class (40\% refund) is Friday, March 7.}}
\subsection*{Week 7 - March 13, 2014 - Classes and Object-Oriented Programming, Part I}
Definition; Philosophical underpinnings;  Syntax of defining classes;  Methods; Instance variables vs. class variables.
\subsection*{Week 8 - March 20, 2014 - Classes and Object-Oriented Programming, Part II}
\verb|__init__(self)| constructor; @classmethod decorator; Magic methods, a.k.a. object hooks; Inheritance; Abstract base classes.
\subsection*{Week 9 - March 27, 2014 - Recursion \& Tree Traversal}
Call stack; recursive functions; NCBI taxonomy tree traversal; Systems biology tree traversal; Case Study: Guess my girlfriend's name. \textbf{\textit{Drop dead day to change credit/audit status is Friday, March 28.}}
\subsection*{Week 10 - April 3, 2014 - Machine Learning}
Cross-validation; Training error vs. testing error; Classification: Support Vector Machines; Regression: Elastic Net; Dimensionality Reduction: PCA; Clustering: KMeans.
\subsection*{Week 11 - April 10, 2014 - Data Visualization}
matplotlib; vincent; vega; D3.js; HTML5 Canvas element.
\subsection*{Week 12 - April 17, 2014 - Sequence Alignment using Biopython}
Lecture by Matt Shirley.
\subsection*{Week 13 - April 24, 2014 - Querying aligned short read sequences using pysam}
Lecture by Matt Shirley.
\subsection*{Week 14 - May 1, 2014 - Manipulating GWAS data using R}
Guest lecture by Jun Ding.
\subsection*{Week 15 - May 8, 2014 - Final Project Presentations}
\subsection*{Week 16 - May 15, 2014 - Final Project Presentations}
\end{document}             % End of document.