Gas station without pumps

2010 June 30

Google Scholar vs. Web of Science

Filed under: Uncategorized — gasstationwithoutpumps @ 05:00

FemaleScienceProfessor recently posted asking about the differences between Web of Science (now grandiosely called Web of Knowledge) and Google Scholar, the two main competitors for finding scientific citations.  (PubMed’s “related citations” can be useful for finding similar papers, but it does not follow the citation chain forward the way that Web of Science and Google Scholar do.)

For me, the differences in what the two sources choose to index and the care they take with the indexing make a big difference, both in raw citation counts and in number of papers indexed.

According to my CV, I’ve published about 85 papers (42 in journals, 21 in refereed conferences, around 19 or 20 unrefereed tech reports, a book chapter, and a couple of patents).

Google Scholar has “about 205” entries when searching with my name.  Most of these can be mapped to one of my actual papers, but some are vague enough citations that I can understand why Google did not try to merge them.  Google Scholar did manage to find one conference paper that I’d forgotten about (it never made it to my CV, and I’m not even sure I have a copy any more).  It can’t have been a very important paper, though, as no one ever cited it except me.  Some of the low-count citations were simply bogus—students who were too lazy to look up one of my papers and so made up a plausible journal and page numbers (one that I checked was indeed to my work, but they got the wrong title and the wrong journal, with volume numbers that weren’t remotely plausible for the journal and year they claimed).  A handful of the  papers are for a different author with the same last name, and whose middle initial is my first initial.  If I specifically remove his papers (using -author: in the search field), I get down to 191 “papers” cited. The h-index that Google Scholar computes is 31—that is, I have 31 papers that have been cited 31 or more times, but I don’t have 32 that have been cited 32 or more times.  Four of these 31 are conference papers, the others are in conventional journals.
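For readers who want the h-index definition spelled out, here is a minimal Python sketch.  The function name `h_index` and the example counts are made up for illustration; this is only the textbook definition, and I am not claiming it is how Google Scholar or Web of Science compute the number internally.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break         # counts only get smaller from here on
    return h


# Hypothetical citation counts, not taken from any real profile.
example_counts = [10, 8, 5, 4, 3, 0]
print(h_index(example_counts))  # prints 4
```

Because the index depends only on the ranked counts, a database that sees fewer citations per paper, or that omits highly cited papers, can only report the same or a lower value.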

Web of Science has only 41 papers for me (none of which are bogus) and includes 4 for which they have found no citations.  Google Scholar has all 4, and even found citations for one of them.  The number of citations is much smaller (my most cited paper has only 596 citations in Web of Science, but 859 in Google Scholar).  Web of Science computes my h-index as only 21, mainly due to the lower citation counts it sees (though not indexing the 4 highly cited conference papers doesn’t help).  The “distinct author” set for my name has only 26 papers for me.  They claim that I can fix this through ResearcherID, but I’ve had 55 papers there for months, and they haven’t fixed it yet.

Interestingly, one of my best-known pieces of work is cited only 251 times according to Google Scholar (plus another 17 references to the associated patent), and 64 times according to Web of Science, but gets about 168,000 hits with Google.  Obviously, I’ve not checked all those hits to see what fraction are bogus, but I suspect that well over half of them are real references to the algorithm in the paper.  It is in a field that is not much given to academic citation.

Since I am now in bioinformatics, a lot of my stuff should be indexed in PubMed, but they only found 19 papers for me using my full name, and 34 using my last name and first initial, all of which were indeed mine.  One conference paper that was not published in a journal was included, but only the bioinformatics papers appear.

Citeseer (which used to be the darling of computer scientists, because it included computer science conferences) does a much poorer job of finding my papers. Even trying 4 different variants of my name, it gets “350 citations” and the most cited paper has only 31 citations.

Bottom line:  Web of Science has the cleanest database, but is missing big chunks of scientific literature.  Google Scholar is the most complete, but has not merged slightly erroneous citations, and is cluttered with bogus information.

2010 June 29

Live-action Math

Filed under: Uncategorized — gasstationwithoutpumps @ 13:37

This blog post is part of an essay I wrote back in 2004.  This section describes “live-action math”.

Except for a few particularly complex topics in my graduate bioinformatics course, I do derivations and examples on-the-fly—I refer to this approach as “live-action math”. I believe that the students benefit from seeing problem-solving techniques being used on problems, rather than just seeing canned solutions, and I can usually avoid running down too many blind alleys.

Live-action math is very demanding, as I need to simultaneously solve a tricky math problem (students always ask about the hardest problems), present general problem-solving methods, and make sure I cover the important concepts for the week.  I can’t do live-action math at 8 in the morning, so class scheduling is important for me.

In one freshman course (Applied Discrete Math), I experimented with having my lectures entirely driven by student questions about the examples or exercises in their textbook.  The first two times I tried this, it was not very successful—I covered all the material, but many freshmen were upset by the lack of organization and offended that I expected them to read the book and try the problems before coming to class.  In fact, some were so pointed in their criticism of this approach to teaching that I was once denied a promotion based on their teaching evaluations. Nevertheless, I tried the approach once again, this time in a self-selected “honors” version of the class.  Although the students were not significantly different from the regular section of the class (based on their test scores on a common final exam), they seemed to enjoy the different teaching style, and I got good ratings that quarter.

2010 June 28

Standards-Based Grading

Filed under: Uncategorized — gasstationwithoutpumps @ 08:48

Several of the blogs I read have been discussing Standards-Based Grading (abbreviated SBG) lately.  Perhaps the strongest proponent of this approach has been Mr. Cornally in his blog Think Thank Thunk.

The key ideas of this pedagogic approach seem to be

  • Break your course down into the small, testable ideas: the “standards” you will base your grading on.
  • Tie every assessment to one or two of the standards, so you know exactly what the assessment is measuring.
  • Give the students access to evaluation separately for each standard, so they can see where they need to improve.
  • Allow students to reassess on any standard where they are not satisfied with their performance.

The standards passed down from the state for school teachers are far too vague and far too broad to be usable for SBG, so teachers have to come up with their own.  This is probably the main strength and the main weakness of SBG: teachers structure their courses around what they want the students to be able to do, but they get almost no help from textbooks, curriculum committees, and colleagues in designing their courses.

SBG is based on assessments that test only one thing at a time, which is marginally possible in a reductionist view of math, but almost impossible in other subjects.  I’ve thought about applying the ideas to my college courses, which are mainly senior and graduate courses, but I’ve had no success in coming up with assessments.  For example, one topic I teach is Markov chains: I want students to understand them well enough to be able to write a computer program to train a Markov model from data, then use the model to analyze other data.  The assignment I’ve come up with to do this depends on several programming and writing skills (program design, familiarity with the language used, documentation skills, debugging skills, …), in addition to their understanding of Markov chains.  I know no way to separate these skills to test them independently, and weakness in any of the skills can make it difficult to determine ability in the others. Many of the skills I want most for students to acquire are inherently intertwined with other skills.  Being able to program a computer requires algorithm design, data structure design, coding, documenting, and debugging, none of which are easily isolated from the rest, and I’m not interested in the skills in isolation, but in their combination, since I’m teaching a class that assumes the programming skills are already there, rather than teaching them.

SBG requires that students be frequently informed of their progress in each standard.  Since I have had a hard time creating assessments that look at only one thing, I have a hard time informing students of what they do and don’t understand.

Reassessment is the aspect of SBG that I have used for years—I allow students to redo any assignment if they are not satisfied with it.  I do hold students to a higher standard on a redo, though.  It is not enough to fix small bugs or grammatical problems that I have pointed out to them—they must show better understanding of the underlying concepts, generally improved writing, or improved programming skills.   This is not quite the reassessment that SBG expects, but SBG assumes that it is easy to generate multiple independent tests for any standard, and that an assessment requires very little time of either the teacher or the student, so that reassessment is cheap.  When the skills to be assessed are complex, a single assessment may take 10–30 hours of teacher time to create and 10–20 hours of student time to do, making reassessment a fairly costly endeavor.

In short, while Standards-Based Grading seems like a good way to structure a course, I’m not convinced that it is feasible for any of the courses I teach, and it is probably only applicable to a small fraction of elementary and secondary-school courses.

2010 June 27

Extracurriculars and homework

Filed under: Uncategorized — gasstationwithoutpumps @ 11:39

Extracurricular activity and homework can often be in competition for a kid’s time, particularly for introverts, who need downtime by themselves to recover from the stress of being with people all day.

My son attended a private school for 7th and 8th grade that has a strong reputation in the arts, particularly singing, theater, and dancing. One of the main attractions for him in going there was their strong drama program. But he ended up never even auditioning for the school plays, because the homework load was so high that he would have had no time for himself if he went to after-school rehearsals.  The school was a 6–12th grade school, and the homework load seemed much higher for the middle schoolers, so that they had little time for extracurriculars.

I thought that the difference might have been one of better time-management skills for the high schoolers, but my son ended up taking half middle-school and half high-school classes in 8th grade, and the homework load was definitely much lower in the high school classes,  with the three high-school classes taking about one-third to one-half the time of 2 of the middle-school classes (the other middle school class, in drama, had almost no homework).

The difference was partly because of the individual teachers. The middle-school history teachers, in particular, took their jobs seriously and taught the students how to research and how to write, as well as a lot of historical information.  The classes were intense and often required more than an hour of homework a night to do properly.  The students got a lot of learning for their effort, though, so it was time well spent. The English classes, on the other hand, were rather fluffy, but had a lot of work to do.  The books they read were good (and the reading took almost no time), but the rest of the English assignments were pretty much time-wasters.  The English teachers demanded papers and all sorts of “craft” projects, but did not provide much guidance on how to create them, nor much feedback on how to improve them.  Teaching writing was left to the history teachers, and writing in English classes seemed to be mainly busywork.

The difference was partly due to subject area, with the writing-intensive history and English classes taking much more time than math, science, or foreign language.

The difference was also driven from the top.  When the school had to downsize at the end of the year (because of shrinking enrollments), the best of the math teachers for advanced students was laid off.  His belief was that he was the one to go because he assigned too much homework and graded students on what they achieved on tests and homework.  My son had him for two classes (geometry and honors algebra 2) and liked his teaching style, but the homework load in those classes was very light compared to middle-school classes.  If the administration felt that the homework load in those classes was too high for high-school students, then there was clearly a difference in belief about what was suitable for middle school and what was suitable for high school.

2010 June 26

Homework load

Filed under: Uncategorized — gasstationwithoutpumps @ 16:15

One problem that teachers end up thinking about a lot is how much homework to assign and whether to include it in the final evaluations of student performance.  In this blog post, I’ll address only the first question—how much to assign, not how to use it.

For college students, there is a fairly simple rule to follow: being a student is a full-time job. For example, on my campus, the standard full-time load is 15 units.  If we assume that a full-time job is 40–45 hours a week of work, then each unit should translate to 2.7–3 hours of work a week.  That work includes attending class (and labs), doing assigned reading, doing term papers, and doing homework.  Our classes are typically 5-unit classes, with 3.5 hours of lecture a week, leaving 9.8–11.5 hours a week for reading, homework, programming assignments, and term papers.
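The arithmetic in that paragraph is easy to redo for a different campus.  Here is a small Python sketch; the function names and default arguments are just my illustration, and the 15-unit load, 5-unit class, and 3.5 lecture hours are the figures quoted above, so substitute your own.

```python
def hours_per_unit(weekly_hours, full_time_units=15):
    """Hours of work per week that one unit represents."""
    return weekly_hours / full_time_units


def out_of_class_hours(weekly_hours, class_units=5, lecture_hours=3.5,
                       full_time_units=15):
    """Weekly hours left for reading and assignments in one class."""
    total_for_class = hours_per_unit(weekly_hours, full_time_units) * class_units
    return total_for_class - lecture_hours


# A 40-hour and a 45-hour "full-time job" week, as assumed in the text.
for weekly in (40, 45):
    print(weekly,
          round(hours_per_unit(weekly), 2),
          round(out_of_class_hours(weekly), 2))
```

With a 40-hour week this gives about 2.67 hours per unit and about 9.8 hours outside lecture for a 5-unit class; with a 45-hour week it gives 3 hours per unit and 11.5 hours outside lecture.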

Although this guideline is simple, guessing how much time students take to do the assignments is often difficult.  For example, a simple programming assignment might take the top students in the class only a couple of hours to do an excellent job on, while the bottom students in the class may spend 10–20 hours to get an only partially-functioning program written.  Reading is also highly variable, with college reading speeds seeming to range from about 30 words per minute to 600 words per minute—and some students never bother to do the reading at all. The workload of the course should not be based on the least competent person in the class, who may not be prepared to do the work no matter how much it is watered down, nor should it be based on the top student in class, as tempting as it is to think “well, G— can do it, so it can’t be asking too much”.

The best approach I’ve found is to do the assignments myself, and assume that the students will take about 3 times as long as I do for each assignment.  Getting feedback from the students on how long specific assignments took, and how much time they spent total on the course can also be useful for adjusting the workload to a reasonable level.  For courses with a larger difference in expected ability between teacher and student, a larger ratio may need to be assumed.

Many middle schools and high schools set up their homework expectations based on “preparing students for college”.  One common rule of thumb is that the total homework load should be 10 minutes per grade level per night (so 10 minutes a night for 1st graders and 2 hours a night for high-school seniors, counting 5 nights a week).  Let’s compare that with the college expectation of 40–45 hours a week of work, including classes.  My son’s middle school had the students for 34 hours a week.  That includes lunch, but many students did homework or met with teachers during lunch, and there wasn’t time for them to go elsewhere, so we’ll count that time as part of their work time (if you insist on treating this as free time, reduce the schooling time to 30 hours a week).  A full-time load for the students would thus involve 6–10 hours a week of homework, which comes pretty close to the rule of thumb for 8th graders: 80 minutes/night * 5 nights = 6.67 hours/week.  The high school my son is attending next year generally has the students for 27.5 hours a week, so a full-time load would mean 12.5 hours of homework a week, or 150 minutes of homework a night.  Of course, many students at that high school are taking 8 courses a year instead of 6, so if we figure a 45-hour week for them, we get 36.25 hours of school and 8.75 hours of homework, or 105 minutes a night (pretty close to the 90–120 minutes a night of the rule of thumb).  First through third grades at one of the local elementary schools have 28.25 hours of school a week, so with the rule-of-thumb homework load, the kids have 29-hour to 30.75-hour work weeks: about a 3/4-time job rather than a full-time job, but certainly enough for 6–9-year-old kids.
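The same comparison can be sketched in a few lines of Python for any school schedule.  The function names are hypothetical, and the numbers plugged in are only the ones quoted above (10 minutes per grade level, 5 homework nights a week, and the school hours for each school).

```python
def rule_of_thumb_minutes(grade, minutes_per_grade=10):
    """Nightly homework under the 10-minutes-per-grade-level rule of thumb."""
    return grade * minutes_per_grade


def full_time_homework_minutes(school_hours, work_week_hours, nights=5):
    """Nightly homework that would fill out a full-time work week."""
    return (work_week_hours - school_hours) * 60 / nights


def weekly_work_hours(school_hours, grade, nights=5):
    """School hours plus rule-of-thumb homework, in hours per week."""
    return school_hours + rule_of_thumb_minutes(grade) * nights / 60


print(rule_of_thumb_minutes(8))                # 8th grade: 80 minutes a night
print(full_time_homework_minutes(27.5, 40))    # 6 courses: 150.0 minutes a night
print(full_time_homework_minutes(36.25, 45))   # 8 courses: 105.0 minutes a night
print(weekly_work_hours(28.25, 3))             # 3rd grade: 30.75 hours a week
```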
