Gas station without pumps

2015 September 1

Pedagogy for bioinformatics teaching

Filed under: Circuits course — gasstationwithoutpumps @ 10:48

I was complaining recently about the dearth of teaching blogs in my field(s), and, serendipitously, almost immediately afterwards I read a post by lexnederbragt, “Active learning strategies for bioinformatics teaching”:

The more I read about how active learning techniques improve student learning, the more I am inclined to try out such techniques in my own teaching and training.

I attended the third week of Titus Brown’s “NGS Analysis Workshop”. This third week entailed, as one of the participants put it, ‘the bleeding edge of bioinformatics analysis taught by Software Carpentry instructors’ and was a unique opportunity to both learn different analysis techniques, try out new instruction material, as well as experience different instructors and their way of teaching. …

I demonstrated some of my teaching and was asked by one of the students for references for the different active learning approaches I used. Rather then just emailing her, I decided to put these in this blog post.

It is good to see someone blogging about teaching bioinformatics—there aren’t many of us doing it, and most of us are more focused on research than on our pedagogical techniques.  For that matter, in my bioinformatics courses, I’ve only been making minor tweaks to my teaching techniques—increasing wait time after asking questions, randomizing cold calls better, being more aware of the buildup of clutter on the whiteboard, … .  Where I’ve been focusing my pedagogic attention is on my applied electronics course and (to a lesser extent) the freshman design seminar.

I’ll be starting my main bioinformatics course in just over 3 weeks, a first-quarter graduate course that is also taken by seniors doing a BS in bioinformatics.  This will be the 14th time I’ve taught the course (every year since 2001, except for one year when I took a full-year sabbatical).  Although the course has evolved somewhat over that time, it is difficult for me to make major changes to something I’ve taught so often—I’ve already knocked off most of the rough edges, so major changes will always seem inferior, even if they would end up being better after a year or two of tweaking.  I think that major changes in the course would require a change of instructor—something that will have to be planned for, as I’ll be retiring in a few years.

My main goals in this core bioinformatics course are to teach some stochastic modeling (particularly the importance of good null models), dynamic programming (via Smith-Waterman alignment), hidden Markov models, and some Python programming.  The course is pretty intense (the Python programming assignments take up a lot of time), but I think it sets the students up well for the subsequent course in computational genomics (which I do not teach) and for general bioinformatics programming in their research labs. I don’t cover de Bruijn graphs or assembly in this course—those are covered in subsequent courses, though both the exercises Lex mentions seem useful for a course that covers genome assembly.

The live-coding approach that Lex mentions in his blog seems more appropriate for an undergrad course than for a grad course.  I do use that approach for teaching gnuplot in my applied electronics course, though I’ve had trouble getting students to bring their data sets and laptops to class to work on their own plots for the gnuplot classes—I’ll have to emphasize that expectation next spring.

It might be possible to use a live-coding approach near the beginning of the quarter in the bioinformatics course, on the first assignment, when I’m trying to get students to learn the “yield” statement for making generators for input parsing. I’ve been thinking that a partial worked example would help students get started on the first program, so I could try live coding half the assignment, and having them finish it for their first homework.

One of the really nice things about Python is how easily one can create input handlers that spit out one item at a time and how cleanly one can interface them to one-pass algorithms. Way too many of the students have only done programming in a paradigm that reads all input, does all processing, and prints all output.  Although there are some bioinformatics programs that need to work that way, most bioinformatics tasks involve too much data for that paradigm, and programs need to process data on the fly, without storing it all.  Getting students to cleanly separate I/O from processing while processing only one item at a time is the primary goal of the first two “warmup” Python programs in the course.
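As a rough illustration of that separation, here is a minimal sketch of a generator feeding a one-pass consumer. It is not the actual course assignment: the FASTA-like input format and the names `read_records` and `longest_record` are assumptions made up for this example.

```python
import io


def read_records(lines):
    """Yield (name, sequence) pairs one at a time from FASTA-like lines.

    `lines` is any iterable of strings (an open file, sys.stdin, a list).
    Only the record currently being assembled is held in memory.
    Lines before the first ">" header are ignored.
    """
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)  # previous record is complete
            name = line[1:]
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)  # last record in the input


def longest_record(records):
    """Return (name, length) of the longest sequence, in one pass.

    Returns None if `records` is empty.
    """
    best = None
    for name, seq in records:
        if best is None or len(seq) > best[1]:
            best = (name, len(seq))
    return best


if __name__ == "__main__":
    # Hypothetical in-memory input standing in for a real file.
    demo = io.StringIO(">a\nACGT\nAC\n>b\nGG\n")
    print(longest_record(read_records(demo)))
```

The consumer never sees a file or a line of text, only records, so the same `longest_record` would work unchanged with a different parser, and neither function stores more than one record at a time.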

One thing I will have to demonstrate in doing the live coding is writing the docstring before writing any of the code for a routine.  Students (and professional programmers) have a tendency to code first and document later, which often turns into code-first-think-later, resulting in unreadable, undebuggable code. I should probably make a bigger point of document-first coding in the gnuplot instruction also, though the level of commenting needed in gnuplot is not huge (plot scripts tend to be fairly simple programs).
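A docstring-first routine might look something like the sketch below. The function `gc_fraction` is a name invented for this illustration, not something from the course. The docstring is written first and pins down the inputs, outputs, and edge cases; the body is filled in only afterwards.

```python
def gc_fraction(seq):
    """Return the fraction of G and C bases in `seq`.

    Case is ignored, so "gc" and "GC" count the same.
    Characters other than A, C, G, and T are excluded from both the
    count and the total.
    Returns 0.0 if `seq` contains no A, C, G, or T.
    """
    # Body written only after the docstring above settled the edge cases.
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = gc + sum(seq.count(base) for base in "AT")
    return gc / total if total else 0.0
```

Writing the docstring first forces decisions (what about lowercase? what about N? what about an empty string?) that would otherwise be made by accident while coding.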

2013 July 4

AP Computer Science MOOC

Filed under: Uncategorized — gasstationwithoutpumps @ 12:14

One approach that is being tried next year to get around the lack of CS instructors in high schools is the first AP Computer Science MOOC.

I have no idea how well this MOOC will work; online education for high schoolers has rather mixed results.  A number of home-schooled students are relying on college-level MOOCs for their instruction, but the drop-out rate is large and the amount of feedback they get is usually too little for high school students (probably too little for college students also).

At least amplify.com got an experienced high-school AP CS teacher to teach the MOOC:

Rebecca Dovi has been teaching high school computer science for over 16 years.

She currently teaches in Hanover County, Virginia where she heads the computer science curriculum committee. She is among 10 secondary school teachers nationwide selected to pilot the new CS Principles course under development by College Board.

One of the other concerns with MOOCs, the lack of verifiable measures of student achievement, is alleviated with this course, as the AP CS A exam provides a fairly well-accepted means of final assessment.

My main concern would be whether students get enough feedback on their programming assignments to learn how to structure and document programs properly—something that is labor-intensive but essential for students to really learn the material properly.  Unfortunately, that is not something easily measured on a 3-hour test like the AP exam, so even decent results on the exam may not tell us whether the students are learning as much as they ought to.  (Of course, the same can be said of in-person AP CS courses—we have no guarantees that the students have learned anything not tested on the exam.)

I think that AP CS does make a good test case for high-school MOOCs—there are few places currently teaching computer programming in high school, and an online course is better than no course.  Aligning the MOOC to the AP test makes it more attractive to high school students and more likely to get high school credit than a random CS MOOC.

Because there are so few high schools teaching CS, the MOOC is not going to displace many teachers using better teaching techniques.

2013 July 3

In defense of programming for physics and math teachers

Filed under: Uncategorized — gasstationwithoutpumps @ 09:14

In response to a comment I made on his blog, Mark Guzdial wrote

I am complete agreement that computing should really be taught within teachers’ disciplines, such as math or physics. Computing is a literacy. We write and do mathematics in science. We should also do computing in science.

Current constraints make that hard to get to.

  • Why should mathematics or teachers want to use computing? It’s harder (in the sense, that it’s something new to learn/use). And it doesn’t help them with their job. Remember the posts I did on Danny Caballero’s dissertation? Computing does lead to mathematics and physics learning, but different from what currently gets tested on standardized tests. Why should people who make up those tests change? To draw more people into computing? Recall how much luck we had getting CS into the new science education frameworks.
  • Who would pay for it? We can get Google to pay for more high school teachers to learn CS — that leads to more computer scientists that they might hire. We can get NSF’s CISE directorate to pay for CS10K — that leads to more CS workers and researchers. Who pays for math and physics teachers to learn computing, especially when learning computing doesn’t help them with their jobs?
  • Finally, in most states, computer science is classified as a business topic. Here in Georgia, the Department of Education did announce that only business teachers could teach computer science. The No Child Left Behind (NCLB) Act requires teachers to be “high qualified” in a subject to teach it. If CS is classified as business, then it makes sense (to administrators that don’t understand CS) that only business teachers are highly qualified to teach it. Barbara Ericson fought hard to get that changed, since some of our best CS teachers are former math and science teachers (who date back before CS became classified as business). I don’t know if, in other states, math and physics teachers are disallowed from teaching CS.

It’s a big, complicated, and not always rational system.

That the system is big and irrational is not news to anyone, and the Georgia Department of Education may be about as silly as Departments of Education get.  I have no idea how to fix dysfunctional government bureaucracies, though, so I won’t comment further on that point.

But I disagree on a couple of things:

  • Learning to use programming effectively can help physics and math teachers do their jobs better.
  • Companies like Google and federal agencies like NSF will pay for teachers to learn computational methods, not just for straight CS teachers.

For the first point, I’m going to have an uphill battle to convince Mark, because he has a carefully done research study that Danny Caballero did for his PhD on his side, and I don’t have 4 years of my life to spend working full time on the question.

I read Mark’s posts about Caballero’s dissertation, and I even wrote about them when the posts first came out (and I started another draft post, but abandoned it).  I agree that Caballero’s results are not encouraging, but I don’t believe that a single experiment at 2 sites decides the issue for all time. Caballero showed that a couple of mechanics courses that taught physics using Matter and Interactions did not spend enough time on the concepts of the Force Concept Inventory (a small but important subset of the concepts of a first physics course), and so students did not learn as much on those topics as in a traditional class. He also showed that students made typical programming errors that reflected poor understanding of both physics and programming, and that students had less favorable attitudes toward computational modeling at the end of the course than at the beginning. The programming errors Caballero found were typical of the errors seen after a first programming course also. If we can’t teach students to avoid those errors when the entire course is focused on programming, it is not surprising that a physics course in which programming is a small add-on also produced students who can’t program well.

Caballero’s thesis study was pretty convincing that those implementations of the intro physics course using computational approaches were not very successful at teaching the concepts of the Force Concept Inventory. I’m not convinced that the problems are inherent to using computational approaches to teach physics, though; the study shows only that these courses had not yet been optimized.  It is indeed possible that Mark’s conclusion (computing doesn’t help teach physics or math) is true, but I think that is too big a generalization from Caballero’s results.

Note that an earlier paper on which Caballero was an author showed that the M&I students showed better gains than students in a traditional course on a BEMA test (Electricity and Magnetism, rather than the mechanics topics of the FCI). So even Caballero’s results are not as uniformly negative as Guzdial paints them.

Personally, I liked the Matter and Interactions book, and I think that its approach helped me and my son learn physics better than we otherwise would have, but we’re hardly the typical audience for a first calculus-based physics course, so I don’t want to generalize too much from our experience either—Caballero’s results (positive and negative) are from 1000s of typical students, not 2 very unusual ones.

There are currently teachers in both physics and math looking at programming as a way both to motivate students and to teach physics and math better.  The spread of the ideas in the community is slow, because the teachers are getting little support, either from fellow math and physics teachers or from the computer science community.  People like Mark say “some of our best CS teachers are former math and science teachers”, but also say “it doesn’t help them with their job.”

Teaching physics and math teachers to program can help them do their jobs better—even if they don’t teach programming to students! There are other ways that programming helps them—for example, Matt Greenwolfe spent a lot of time programming Scribbler 2 robots to be better physics lab tools than the usual constant-velocity carts. Other physics teachers are doing simulations, writing video analysis programs (I contributed a little to Doug Brown’s Tracker program), improving data logging and analysis programs, and so forth.  A lot of math teachers are using GeoGebra to make interactive geometry applets (and, more rarely, to have students do some programming in GeoGebra).

As for my second point, there are already many corporate and federal programs to try various ways of improving STEM teaching (the CS portion of that is actually tiny).  To convince them to spend some of that money on teaching math and physics teachers to program, we may need some better use cases than the intro mechanics courses that Caballero studied—or we may just need to re-examine those courses after the instructors have done some optimization based on the feedback from Caballero’s study.

2013 July 1

How can we get more programming taught in high schools?

In the comments on Mark Guzdial’s post Why AP CS:Principles is a good thing: Responding to Gas Station without Pumps (which is a response to my post Millions for a fairly useless new test), an interesting question has arisen: What should a CS teacher know?

I commented

I agree that figuring out what content an intro CS teacher needs to know is important, both in depth and in breadth. If we set the bar too high, there will be no CS teachers in public schools (essentially the current situation). If we set the bar too low, no CS will be taught and we’ll have to undo the damage once the students get to college.

CS as a field is still struggling with how to teach beginners (it is pretty clear that some students learn, but it is not clear to me how much this correlates with what teachers do—but that’s your [that is, Mark Guzdial’s] area of expertise, not mine).

Defining the core competencies that a beginning instructor of beginning students needs seems to me quite difficult. I suppose it starts with deciding what the students need to learn, then figuring out what the teacher needs to be able to do to get them there. I further suppose that this is the intent of the CS Principles course—figuring out the minimal set of essential skills we want out of a first course.

Garth commented

In most schools the CS teacher will also be teaching something else; Math, Science, Art or whatever so the requirements have to be realistic. I think CS Ed would almost have to be a minor, there are just not enough jobs out there yet for a teacher with only a CS Ed major.

So Garth has been thinking of it in terms of new teachers only, it seems. I suspect that we’d get more CS teachers more quickly by summer training for existing math and physics teachers than by trying to train new teachers in ed schools.

Physics teachers could be attracted to programming by a computational modeling curriculum, like the one used in the Matter and Interactions textbook.  VPython provides a fairly simple entry point for physics teachers and students to write simulations of the sorts useful for AP Physics (both C: Mechanics and B).  I think that a computation-based text for Physics B still needs to be written, as Matter and Interactions definitely requires calculus after the first couple of weeks (or is there already an algebra-based physics book using something like VPython?).  Once physics teachers become proficient in VPython, it is not a big stretch for them to teach the CS Principles course (they could even continue to use VPython for it).
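To give a flavor of the kind of simulation meant here, the sketch below steps a dropped object forward in time by repeatedly updating its momentum and then its position. It is plain Python with no graphics, not a VPython program, and the function name `simulate_fall`, the default parameter values, and the time step are assumptions chosen for this illustration rather than anything taken from a textbook.

```python
def simulate_fall(height, mass=1.0, g=9.8, dt=0.001):
    """Return the approximate time for a dropped object to reach the ground.

    The object starts at rest at `height` metres, and gravity is the
    only force. Each step applies the momentum update p += F*dt, then
    moves the object using the updated velocity p/m.
    """
    y = height
    p = 0.0             # momentum (kg m/s), positive is up
    t = 0.0
    force = -mass * g   # constant gravitational force (N)
    while y > 0.0:
        p += force * dt        # momentum update
        y += (p / mass) * dt   # position update using the new velocity
        t += dt
    return t


if __name__ == "__main__":
    # Compare against the analytic answer sqrt(2*h/g).
    h = 10.0
    print(simulate_fall(h), (2 * h / 9.8) ** 0.5)
```

The loop uses only algebra, so the same structure could be used with students who have not had calculus; comparing the numerical result with the closed-form answer is a natural first check.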

Math teachers could be attracted to programming by summer workshops based around Project Euler, which provides a series of math challenges to be solved by programming (currently 434 such challenges).  Providing them with instruction in a suitable programming language (Python is a good choice for Project Euler) so that they can tackle the math problems would give them the programming experience needed before they would consider teaching programming.  Teaching them to program in GeoGebra, free software for doing geometry and algebra presentations and apps, would also be valuable, both for improving their programming skills and for improving their current math courses.
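As an example of the scale of program involved, the first Project Euler problem asks for the sum of all the multiples of 3 or 5 below 1000. A direct Python solution is only a few lines (the function name `sum_multiples` is just a choice made for this sketch):

```python
def sum_multiples(limit, divisors=(3, 5)):
    """Return the sum of the natural numbers below `limit` that are
    divisible by at least one of `divisors`."""
    return sum(n for n in range(1, limit)
               if any(n % d == 0 for d in divisors))


if __name__ == "__main__":
    print(sum_multiples(10))    # the problem's own small example: 3+5+6+9 = 23
    print(sum_multiples(1000))
```

A teacher who gets this far can then be nudged toward the closed-form answer using arithmetic series and inclusion-exclusion, which is where the math and the programming start reinforcing each other.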

The key point of both these ideas is that we could attract physics and math teachers to programming in order to become better teachers in their current fields.  That they would also become competent to teach beginning CS courses is a bonus.  Even if this approach failed to produce any new CS courses, we would still have improved physics and math teaching.  Given how addictive programming is, I think that we would also find these teachers becoming a force within their schools for creating programming courses, avoiding the current Catch-22: that there are no CS courses because there are no CS teachers, and no CS teachers because there are no courses.

I’ve not addressed in this post the initial question from the comments: what do beginning CS teachers need to know?  One implication of my proposal is that CS teachers need to be able to program.  They don’t need to be fantastically good programmers, nor do they need to know many different programming languages, but they need to be able to program and debug in the language of instruction.  They need to be able to model debugging, and they need to be able to assist students who are stuck (without taking over for the student).  They need to have personally done every assignment they assign, to figure out any ambiguities in the wording of the assignment and to make sure that it is doable with the tools and techniques that the students have been given in their class.

I think that may be enough—I don’t think that beginning CS teachers need to know software development techniques and the intricacies of the development environments and libraries beyond what is essential for the assignments.  They may choose to learn more (some math teachers might enjoy asymptotic analysis of algorithms, for example, and some physics teachers might get into programming robots), but it isn’t necessary for teaching an intro course.

I don’t think we can set the bar any lower: a teacher who can’t program can’t teach programming effectively, and programming should be at the heart of any intro CS course.

2013 June 19

Millions for a fairly useless new test

According to the College Board Press Release: The National Science Foundation Provides $5.2 Million Grant to Create New Advanced Placement® Computer Science Course and Exam.

Innovative College-Level AP® Course Created to Increase Interest in Computing Degrees and Careers, Particularly Among Female and Minority Students

To help ensure that more high school students are prepared to pursue postsecondary education in computer science, the National Science Foundation (NSF) is making a four-year, $5.2 million grant to the College Board’s Advanced Placement Program® (AP®) to fund the creation of AP Computer Science Principles (AP CSP).

The college-level AP CSP course will be introduced into thousands of high schools nationwide in fall 2016, with the first AP CSP Exam set to be administered in May 2017. Unlike computer science courses that focus on programming, AP CSP has been designed to help students explore the creative aspects of computing while also providing a solid academic foundation for understanding the intellectual concepts and practical contributions of computing. AP CSP includes a curriculum framework designed to promote learning with understanding, a digital portfolio to promote student participation throughout the year, and a course and assessment that is independent of programming language.

Successful implementation of the AP CSP course will hinge on the ability to recruit and train qualified teachers with computer science backgrounds to teach the course. Through its CS 10K Project (10,000 computer science teachers in 10,000 high schools by 2016), NSF has been laying the foundation for an unprecedented, national effort to prepare educators to teach this new material using hands-on, inclusive curricula.

I know that Mark Guzdial is fond of the Computer Science Principles (CSP) course that has been prototyped for a few years now at colleges, but I’m not convinced that it represents college-level course work (the supposed intent of AP courses and exams). I don’t know much more about the course than when I blogged about it in 2011, so my opinions in this article may reflect my own lack of knowledge about the course more than anything else.

I’m not saying that CSP is a bad course, or even that it is a bad introduction to computer science, but it seems to me to be at best high-school level. I know that many colleges disagree—the press release says

In a recent survey of 103 of the nation’s top colleges and universities, 87 percent confirmed that AP CSP requires the same content knowledge and skills as the related introductory college course, and 86 percent indicated a willingness to award college credit for qualifying scores on future AP CSP Exams.

There are also colleges that teach high school algebra and precalculus, but we don’t offer AP exams in them.

My own campus has several intro programming courses, some at the level of the AP CSP course.  I suspect that our campus would offer credit in these low-level courses for the AP CSP exam. These lowest-level courses do not count towards any major, though—they provide elective credit for what should be high-school level courses.  The intent (as is apparently the intent for AP CSP) is to provide an extremely low barrier to entry into the field.

I don’t know how well the low barrier to entry works, though.  I’ve not seen much evidence on our campus that the lowest level courses produce many students who continue to take higher level CS courses. Of course, I’ve not tried to get reports on that from the campus academic planning office, as I have enough to do without meddling in the affairs of other departments.  We still have appallingly low numbers of women finishing in CS (and the new game-design major within CS is even more heavily male), so I can’t say that the lower-level intro courses have done much to address the gender imbalance.

The success of CSP also depends on thousands of high schools suddenly deciding to teach the course and getting training for their teachers to do this. I (along with many others) have grave doubts that the schools have the desire or the ability to do this. It is true that the CSP course should be a bit easier to train people for than the current AP CS A course (if only because Java syntax, the core of CS A, is so deadly dull).

Even if, by some miracle, the NSF manages to train 10,000 teachers to program well enough to teach programming, the result is likely to be underwhelming.  I suspect that many will leave teaching—many of the math teacher bloggers I’ve followed who learned to program have moved out of teaching into being full-time programmers.    The result of the 10K project may not be a huge increase in high school CS teachers, but a loss of some of the better math teachers and the production of a core of under-trained programmers.

The justification for the new AP CSP course is that it will drive many more students into computing fields. The College Board continues to confuse correlation with causation:

Research shows that students who took college-level AP math or science exams during high school were more likely than non-AP students to earn degrees in physical science, engineering and life science disciplines — the fields leading to careers essential for the nation’s future prosperity.

Students wanting to do STEM fields in college often chose that path in high school, and took as many STEM courses as they could in order to get into good colleges.  Quite likely, wanting to do STEM in college caused them to take AP exams, not the other way around.

I’m not defending the current AP CS exam—from what I’ve heard about the AP CS A course and exam, it is mainly about Java syntax.  Personally, I think that Java is a poor pedagogical choice for a first programming language (I still favor the sequence Scratch, Python, C, Java), and using it as the language for the AP CS exam forces high schools into poor pedagogy.  The new CSP exam is not supposed to be so language-dependent, which may allow for better pedagogy.

Of course, I’m curious how the exam will be written to be language-independent, and whether it will be able to make any meaningful measurements of what the students have learned.  I’ve never been convinced that exams do an adequate job of measuring programming skills, and I’m not sure what the new exam will measure since the new course is “unlike computer science courses that focus on programming”.

I suspect that the easier AP CSP will replace AP CS A at many high schools, and that CS A will disappear the way that CS AB did in May 2009 (Gresham’s Law for pedagogy: easier courses drive out harder ones).  Whether this is a good or bad outcome depends on how good the AP CSP course turns out to be.

Overall, I’m simply not convinced that the College Board needs federal funding of $5.2 million to develop a new exam.  They are going to make enough money off the new exam that they should be able to fund it without subsidies.
