Gas station without pumps

2015 March 27

Bogus comparison of Word and LaTeX

Filed under: Uncategorized — gasstationwithoutpumps @ 09:36

An article was recently brought to my attention that claimed to compare LaTeX to Word for preparing manuscripts: PLOS ONE: An Efficiency Comparison of Document Preparation Systems Used in Academic Research and Development. The authors claim,

To assist the research community, we report a software usability study in which 40 researchers across different disciplines prepared scholarly texts with either Microsoft Word or LaTeX. The probe texts included simple continuous text, text with tables and subheadings, and complex text with several mathematical equations. We show that LaTeX users were slower than Word users, wrote less text in the same amount of time, and produced more typesetting, orthographical, grammatical, and formatting errors.

It turns out to be a completely bogus study: they compared typing and typesetting tasks, not authoring tasks. There was no inserting new figures or equations into the middle of a draft, no rearranging sections, no changing citation styles, not even any writing, just copying text from an existing typeset document. It is very misleading to say that the “LaTeX users … wrote less text”, as none of the subjects were writing, just copying, which uses a very different set of skills.

I don’t think that there is much question that for simply retyping an existing document, a WYSIWYG editor like Word is better than a document compiler like LaTeX, but that has very little to do with the tasks of an author. (And even they noted that the LaTeX users enjoyed the task more than the Word users.)

For those of us who use LaTeX on a regular basis, the benefits do not come from speeding up our typing; LaTeX is a bit slower to work with than a WYSIWYG editor. The advantages come from things like automatic renumbering of figures and references to them, floating figures that don’t require manual placement (except when there are too many figures, when having to do manual placement with LaTeX is a pain), good math handling, automatic formatting of section and chapter headings, being able to define macros for commonly used actions, and the versatility of having a programming language available. For example, I have a macro that I like to use for proper formatting of conditional probability expressions, and another that I use for references to sections, so that I can switch between “Section 3.2”, “Sec. 3.2”, and “§3.2” through an entire book with a change to just one line in the file.
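As a rough illustration of the two macros just described, here is a minimal sketch. The names \condprob and \secref and the label sec:background are made up for this example; the actual definitions are not shown in this post, so treat these as one plausible way to write them rather than the macros actually in use.

```latex
\documentclass{article}

% Hypothetical macro: conditional probability with a bar that
% stretches to match the parentheses, \condprob{A}{B} -> P(A | B).
\newcommand{\condprob}[2]{P\left(#1 \,\middle|\, #2\right)}

% Hypothetical macro: section references. Changing this one
% definition changes the style everywhere in the document.
\newcommand{\secref}[1]{Section~\ref{#1}}
% Alternatives (use exactly one definition):
% \newcommand{\secref}[1]{Sec.~\ref{#1}}
% \newcommand{\secref}[1]{\S\ref{#1}}

\begin{document}
\section{Background}\label{sec:background}
Bayes' rule gives
$\condprob{A}{B} = \condprob{B}{A}\,P(A)/P(B)$.
See \secref{sec:background} for details.
\end{document}
```

Swapping which \secref line is uncommented is the "change to just one line" kind of edit: every reference in the document follows the new style on the next compile.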

LaTeX also has the advantage of having a much longer life span than Word—I can still run 30-year-old LaTeX files and print them, and I expect that the files I create now will still be usable in 30 years (if anyone still cares), while Word files become unusable in only 10-to-20 years.  LaTeX is also free and runs on almost any computer (the original TeX was written for machines that by modern standards were really tiny—64k bytes of RAM).

For those who want multiple-author simultaneous access (like Google Docs), there are web services like sharelatex.com that permit multiple authors to edit a LaTeX document simultaneously. I’ve used sharelatex.com with a co-author, and found it to be fairly effective, though the server behind the rendering is ridiculously slow—40 seconds for  a 10-page document on the web service, while I can compile my whole 217-page textbook three times in about 12 seconds on my 2009 MacBook Pro.

Like the emacs vs. vi wars, the LaTeX vs. Word camps are more about what people are used to and what culture they identify with than the actual advantages and disadvantages of the different tools. Bogus studies like the one in PLOS ONE don’t really serve any positive function (unless you happen to be a monopoly software seller like Microsoft).

 

Followup on plagiarism

Filed under: Uncategorized — gasstationwithoutpumps @ 08:26

In Plagiarism detected, I mentioned that an article in Nature Biotechnology plagiarizes from my blog, specifically Supplementary Material page 6 from Segmenting noisy signals from nanopores. I got email from the last author this week, explaining the situation:

We saw your recent blog post about our paper and feel that we owe you an explanation.

At the time we read your level-finding blog post we had already implemented a recursive level-finding algorithm that we have been using in our lab. Our algorithm made comparison of two data segments using a T-test. We came across your blog and found that the logP value was more useful than the T-test. We wanted to cite your blog, but Nature’s online publication guidelines made it seem that “Only articles that have been published or submitted to a named publication should be in the reference list” (http://www.nature.com/nature/authors/gta/#a5.4). While we wanted to present our methods as transparently as possible, we had no intention of claiming your work as ours. We should have made efforts to contact you and NBT editors about how to best cite your contribution.

I have contacted NBT to see if a post-publication citation to your blog can be made and I will keep you posted on this.

We noted your recent BioarXiv manuscript and will refer to it in future publications using logP-test level-finders.

So one of the two corrections I was seeking has been met (an apology from the authors), and the other (a citation to the blog) is being sought by the authors. It seems that Nature has a very poor policy about citations, discouraging correct attribution.  Yet another reason to consider them a less desirable family of journals (their rip-off pricing for libraries and their preference for sensational articles over careful research are others).

On a related front, referees for our journal submission of the segmenter paper pointed out that several of the ideas are not new (hardly surprising), and that the basic algorithm has been around for quite a while.  They pointed us to a paper by Killick, Fearnhead, and Eckley (http://arxiv.org/pdf/1101.1438.pdf), which supposedly has an exact algorithm that is as efficient as binary segmentation (which only approximates the best breakpoints). I thank the referees for the pointer—that is the sort of thing peer review is supposed to be good for: pointing out to authors where they have missed relevant prior literature.

I’ve only glanced through the paper (I had 16 senior theses to grade in 4 days, plus trying to get a new draft of my book for my applied electronics course done in time for classes starting next Monday), so I can’t say anything about the algorithm they present, but they do give a citation for the binary algorithm that dates back to 1974:

Scott, A. J. and Knott, M. (1974). A cluster analysis method for grouping means in the analysis of variance. Biometrics, 30(3):507–512.

The online version of the journal only goes back to 1999, so I’ve not confirmed that the paper does contain the same algorithm, but it would not surprise me if it did—the binary split method is fairly obvious once the basics of splitting on log-likelihood are understood.  I had looked for papers on the technique and not found them (which surprised me), but I didn’t look as hard as I should have. I did not find the right entry points to the literature—it is scattered over many different disciplines and I relied too much on the one textbook that I did find to give me pointers. And I didn’t read all the textbook, so I may have missed the appropriate pointers—though they do not cite Scott and Knott, so maybe the textbook authors missed an important chunk of the literature, too.

Now that the Killick et al. paper has given me some useful pointers, I have a lot of reading to do.  I don’t know if I’ll have time before the summer, though—my teaching load starting next week is pretty heavy (I was just noticing that my calendar had 24.5 hours scheduled for the first week, not counting time for prepping for classes, setting up the lab, grading, or revising the book for the electronics class: 7 hours of lecture, 12 hours of lab class, 2 office hours, 1.5 hours meeting with the department manager, 2 hours faculty meeting—and the dean wants to meet with me for half an hour sometime also).

Given that the main idea in our segmenter paper is an old one, for it to be salvageable, we’ll have to shrink the basic algorithm to a brief tutorial (with citations to prior inventors) and concentrate on the little changes made after the basic idea: the parameterization of the threshold setting and the correction for low-pass filtering.  There may be a little bit for applying the idea to stepwise slanting segments using linear regression, but I bet that idea is also an old one, buried somewhere in the literature.

This summer I may want to look at implementing the ideas of the Killick et al. paper (or other similar approaches), to see if they really do produce better segmentation as quickly.

2015 March 18

Freshman design projects moderately successful

I just finished grading this year’s freshman design projects. I think that the projects were more successful this year than last year, in part because I kept the students focussed on electronics and programming (for which they had lab access and which I could help them debug), and in part because the projects were somewhat less ambitious.

There were two groups doing EKGs and four groups doing blood pressure meters. Both EKG groups managed to demonstrate their projects working, as did one of the blood pressure groups. (I’m being fairly generous here about what “working” means: they had to get their electronics to work, capture the data, and plot the waveforms, but further interpretation or software was not required.) The other three blood pressure groups did not manage to demonstrate their projects, but one of them managed to plot waveforms for the pressure measurements (without getting their high-pass filter and amplifier working for the pulse measurements).

Some things I learned for next year:

  • Tell the students what op amp to get.  A number of students picked op amps that turned out to be rather old-fashioned ones with very low input impedance (as low as 2MΩ), rather limited output ranges, and external nulling circuits. The cheap MCP6002 or MCP6004 chips would have worked better at lower cost.  In fact, I gave one group that seemed to have a good schematic (but couldn’t get their circuit to work) an MCP6002 chip, which they wired in place of the op amp they had been using, and their circuit worked immediately.  I would have done the same for other groups, but the others with poorly chosen op amps were about a week behind and did not have circuits that were that close to being functional.
  • Warn students sooner not to use FedEx.  My son’s and my experience with FedEx this year has been that they are ludicrously slow. At least one group was burned by a ridiculously long delivery time, having ordered with FedEx delivery just hours before I warned the class about them.  (The US Post Office is faster and cheaper for lightweight electronics orders from Digi-Key.)
  • Students who never ask questions in class probably don’t understand much that is going on—all the groups that successfully demonstrated their projects had at least one active participant in class.
  • Students who fail to turn in their progress report are almost certainly not going to complete the project on time—I need to be more assertive in getting them moving and demanding that they show me their schematics.  Almost everyone had errors in their schematics on their first design (and one of the successful groups went through 4 incorrect designs before getting to one that worked).  Students that are afraid to show me incorrect or incomplete work don’t get the feedback they need to correct the problems—I need to normalize errors more and insist on seeing stuff, even if it is wrong.
  • The MPX5050DP pressure sensors are very easy for students to use, though a bit pricey at $16 each.  The built-in amplifier makes doing pressure measurements with an Arduino fairly trivial (hook up the three wires of the sensor to A0, +5V, and GND).  They were a good choice for the freshman design seminar, though I’ll continue to use MPX2053DP sensors without an integrated amplifier for the applied circuits class, since that assignment is intended to get students to design with an instrumentation amp and to understand a bit about strain gauges.
  • Get the students to plot stuff earlier in the quarter. One group tried installing gnuplot on a Mac in the lab in the last few hours, which did not go well for them.  They did eventually find a plotting program that they could install and run, but then did not have time to run the data they collected through the filtering program I’d written for the class.  Their signals were pretty clean, though, and the plots they produced were good even with just the RC high-pass filter in their amplifier, without digital filtering.
  • The students seemed (for the most part) pretty excited about the projects—even those whose projects didn’t quite work seem to have gotten a lot out of the lab times.  I should look in a couple of years to see how many have stuck with engineering majors (I suspect that some might switch to computer science or computer engineering, rather than sticking with bioengineering, but that’s ok).
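For the pressure-sensor hookup in the list above (sensor output on A0, powered from +5V), turning logged Arduino readings into pressures takes only a few lines. The sketch below runs on the logged data, not on the Arduino. It assumes a 10-bit ADC referenced to the same 5 V supply as the sensor, and a transfer function Vout = Vs × (0.018 × P + 0.04) with P in kPa. That formula is quoted from memory of the sensor datasheet, so check it before relying on it. The function names and the 1-second time constant are invented for illustration, and the high-pass filter is a generic first-order one, not the filtering program written for the class.

```python
ADC_FULL_SCALE = 1023   # assumed 10-bit ADC (0..1023), referenced to the sensor supply
KPA_TO_MMHG = 7.50062   # 1 kPa is about 7.50062 mmHg

def counts_to_kpa(counts):
    """Convert a raw ADC reading to pressure in kPa.

    Assumes the ratiometric transfer function Vout/Vs = 0.018*P + 0.04
    (P in kPa); verify against the datasheet for the actual part.
    """
    ratio = counts / ADC_FULL_SCALE
    return (ratio - 0.04) / 0.018

def counts_to_mmhg(counts):
    """Same conversion, reported in mmHg as blood pressure usually is."""
    return counts_to_kpa(counts) * KPA_TO_MMHG

def high_pass(samples, dt, time_constant=1.0):
    """Generic first-order (RC-style) digital high-pass filter.

    dt is the sampling interval in seconds; time_constant plays the role
    of R*C. The output starts at 0 and follows changes in the input, so a
    slowly falling cuff pressure is removed and the pulse ripple remains.
    """
    if not samples:
        return []
    alpha = time_constant / (time_constant + dt)
    out = [0.0]
    for prev, cur in zip(samples, samples[1:]):
        out.append(alpha * (out[-1] + cur - prev))
    return out
```

A typical use would be to map every logged reading through counts_to_mmhg to get the cuff-pressure curve, then run that curve through high_pass to pull out the pulse waveform.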

2015 March 15

Bruni opinion column on college admissions

Filed under: Uncategorized — gasstationwithoutpumps @ 10:58

In How to Survive the College Admissions Madness, Frank Bruni writes consoling advice for parents and high school seniors wrapped up in college admissions and set on going to elite colleges. Although the obsession with elite-or-nothing is more a New York thing than the American universal he treats it as, it is common enough to be worth an opinion column, and he does a nice job of providing a couple of stories that counter the obsession. (No data though: his column is strictly anecdotal, with 5 anecdotes.)

He recognizes that he is really talking to a small segment of the population:

I’m describing the psychology of a minority of American families; a majority are focused on making sure that their kids simply attend a decent college—any decent college—and on finding a way to help them pay for it. Tuition has skyrocketed, forcing many students to think not in terms of dream schools but in terms of those that won’t leave them saddled with debt.

But the core of the advice he gives is applicable to anyone going to college, not just to those seeking elite admission:

… the admissions game is too flawed to be given so much credit. For another, the nature of a student’s college experience—the work that he or she puts into it, the self-examination that’s undertaken, the resourcefulness that’s honed—matters more than the name of the institution attended. In fact students at institutions with less hallowed names sometimes demand more of those places and of themselves. Freed from a focus on the packaging of their education, they get to the meat of it.

In any case, there’s only so much living and learning that take place inside a lecture hall, a science lab or a dormitory. Education happens across a spectrum of settings and in infinite ways, and college has no monopoly on the ingredients for professional achievement or a life well lived.

The elites have some resources to offer that colleges with lesser financial endowments find difficult to match, but any good enough college can provide opportunities to those who look for them.  For some students, being one of the best at a slightly “lesser” institution may result in more opportunities, more faculty attention, and more learning than being just above average in an elite school.  (And, vice versa, of course—moving from being the best in high school to run-of-the-mill at an elite college can also be an important wake-up call.)

Currently, the American college landscape is very broad, offering a lot of different choices with different prices and different strengths.  Unfortunately, many of our state legislatures and governors have decided that only one model should be allowed—the fully private, job-training institution—and are doing everything they can to kill off the public colleges and universities that have been the backbone of US post-secondary education since the Morrill Land-Grant Acts of 1862 and 1890.

The colleges established by the land grant acts were intended as practical places, not primarily social polish for the rich (as most private colleges were then, and most of the elites are now).  The purpose of these public colleges was

without excluding other scientific and classical studies and including military tactics, to teach such branches of learning as are related to agriculture and the mechanic arts, in such manner as the legislatures of the States may respectively prescribe, in order to promote the liberal and practical education of the industrial classes in the several pursuits and professions in life. [7 U.S.C. § 304, as quoted in Wikipedia]

Although agriculture is no longer as large an employer as it was in the 19th century, research in agriculture at the land-grant universities is still driving a major part of the US economy, and engineering (quaintly referred to as “the mechanic arts”) is still a major employer and a primary route for upward social mobility in the US.  The land-grant colleges were explicitly not intended as bastions for the rich to defend their privilege (as our legislators want to make them, by raising tuition to stratospheric levels), but for “liberal and practical education of the industrial classes”—colleges for working-class people.

I think that it would benefit the US for legislatures to once again invest in the “education of the industrial classes in the several pursuits and professions in life” and for parents and students to look seriously at the state-supported colleges, before the madness of privatization wipes them out.

(Disclaimer: I teach at one campus of the University of California, and my son attends another—neither of them land-grant colleges, but both imperiled by the austerity politics of the California legislature, who see their legacy in building prisons and making sure the rich don’t pay taxes, not in providing education for the working class.)

2015 March 14

History of electronics via Google ngrams

Filed under: Uncategorized — gasstationwithoutpumps @ 22:16

I was playing with Google ngrams today (checking to see whether some variant spellings were ever mainstream) and came up with a history of electronics in one graph:

A short history of electronics in a few key words. At first, power is what mattered, and voltmeters and ammeters ruled. In the 40s, time-varying signals mattered, and oscilloscopes started getting attention. Time-varying signals ruled until digital electronics took over with the introduction of the microprocessor. Now all these low-level views are losing space to consumer-level gadgets like mobile phones.

I could have picked different words, but because Google ngrams provides no way to switch to a log scale for the y-axis (the only sensible way to show growth or decay of word usage), it is not feasible to put a common word like “computer” on the same graph as a rare word like “multimeter”. Google, as always, provides an almost-reasonable product, then never takes the trouble to finish it to allow the user to do things right. Oh, well, it’s free, and that’s the business model Google is relying on: ads on free (almost usable) stuff. The two things they do well are search and selling ads.
