Gas station without pumps

2015 January 31

Weight-loss progress report

Filed under: Uncategorized — gasstationwithoutpumps @ 11:18

In my post 2015 New Year’s resolution, I said that I wanted to lose 10–15 pounds by June 2015. I recognized that New Year’s resolutions rarely last, but that making a public commitment helps people stick with their resolutions.

I’ve been slowly putting on weight for decades (about 1 lb/year since I was 20 years old). I started out very skinny, so the first 20 years probably were getting me to a healthier weight, but I retained my skinny-guy eating habits even as my metabolism slowed with age and I overshot my ideal weight.

When I made my New Year’s resolution, I recognized that successful weight loss requires a change of habits:

Given that I’m unlikely to sustain an increased exercise regime for long enough to lose much weight, it seems like my best bet will be to try to regulate my diet.  …

I’ll try to cut back on some of the high-calorie foods (like cheese and ice cream) and increase my intake of bulky low-calorie foods (like vegetables).  Changing habits that I developed when I was a skinny person is going to be hard, but I’m hopeful that I can reset the weight homeostasis back to what it was a decade ago, and that within six months new dietary habits will be sufficiently established to be able to maintain the weight without struggle.

What I ended up doing was allowing myself to eat any amount of food for lunch, but only raw fruits and vegetables, which have a fairly low calorie density, filling me up without fattening me.  I’ve been generally eating a couple of carrots’ worth of carrot sticks, a few stalks of celery, some red cabbage or jicama, and an apple for lunch. I’ve stopped eating the tacos from the taco truck (which I miss) and stopped eating snacks from the vending machine (which were never that good anyway).

My evening meal is not restricted by type of food—my wife and I have pretty much been on a “Mediterranean diet” for years, so we didn’t see any reason to change the balance of foods we ate for dinner. Instead, I’ve been trying to control how much I eat by eating slower and stopping before I’m completely stuffed. To keep from feeling deprived of treats, I still allow myself a small amount of chocolate a day (25g of dark chocolate—half of a small Trader Joe’s bar) and occasionally have a mug of hot cocoa (made with non-fat milk, sugar, and Droste cocoa powder).

My exercise levels have remained unchanged, consisting almost entirely of bicycle commuting and weekly bike shopping trips (averaging 4.76 miles/day this month—slightly higher than my long-term average of 3.83 miles/day, but normal for the school term). The intensity of my bicycling depends mainly on how far behind schedule I’m running on my way into work, which is a random variable that is pretty much independent of whether I’m trying to lose weight.

So, one month in, how am I doing?  Let’s look at a plot of my weight over the last few years:

My long-term trend over the past 40 years has been a fairly steady pound per year gain, but last year my rate of increase went up.  The diet is bringing my weight down fairly fast.


I’m back in the ballpark of what I weighed in 2011, and my rate of weight loss (1.24 lbs/week) is in the 1–2 lbs/week range that the CDC recommends for diets that result in a lower stable weight (rather than rebounding).  At the current rate, I need to stay on the diet for another 8 weeks, then switch to a maintenance diet to hold a constant weight.  Of course, it is quite likely that at some point before then my weight will stop decreasing, as my dedication to the weight-loss diet wanes, but things are going well so far.

 

2015 January 27

Build a better life by blogging

Filed under: Uncategorized — gasstationwithoutpumps @ 18:27

Today in the mail I got my second piece of swag as a result of my blogging.  (The first was a prototype of the Bitscope DP01 active differential probe, which I blogged about in First blogging swag and Bitscope differential probe.)

This time, what I got was American Science and Surplus’s “Customer Thank You Gift Bag”, which contains a coffee mug, hot sauce, rainbow glasses, and some plastic tops.

The total retail value of this swag is about $10, though I would not have spent that much to acquire any of it.  I might have picked up the mug in a thrift store for 50¢, as I need some more coffee mugs to use as beakers in the circuits course (for the thermistor lab and the electrode lab).  We can use the hot sauce (if it is any good), but the rainbow glasses and the plastic tops will need to be given away.  (I wonder if I should give them to my freshman design students for  good class participation or something, or whether I should have my wife give them to the first-grade teacher at her school.)

The reason that I was sent this rather valueless $10 gift bag is that American Science and Surplus noticed my post about a motor I bought from them a few years ago:

Kudos and thanks to the author – we have been perplexed about this motor for too long. Please contact us at http://www.sciplus.com so we can send you a little gift.
With kind regards from American Science and Surplus

It was nice of them to send me something, but I would rather have had some random motors than what they did send.

At about $85 in swag from 1467 posts, I’m making about 6¢ a post—good thing I’m not doing it for the financial rewards!

2015 January 26

More senior thesis pet peeves

Filed under: Uncategorized — gasstationwithoutpumps @ 22:17

I previously posted some Senior thesis pet peeves. Here is another list, triggered by another group of first drafts (in no particular order):

  • An abstract is not an introduction. Technically, an abstract isn’t really a part of a document, but a separate piece of writing that summarizes everything important in the document. Usually the abstract is written last, after everything in the thesis has been written, so that the most important stuff can be determined. Most readers will never read anything of a document but the abstract.
  • Every paragraph (in technical writing) should start with a topic sentence, and the remaining sentences in the paragraph should support and expand that topic sentence. If you drift away from the topic, start a new paragraph! The lack of coherent paragraphs is probably the most common writing problem I see in senior theses.
  • I don’t mark every error I see in student writing. It is the student’s responsibility to learn to recognize problems that I point out and to hunt down other instances themselves. Students need to learn to do their own copy editing (or copy edit each other’s work)—I’m not interested in grading my own copy editing on subsequent drafts of the thesis.
  • Every draft of every document that is turned in for a class or to a boss should have a title, author, and date as part of the document.  Including this meta-information should be a habitual action of every engineer and every engineering student—I shouldn’t be seeing last-minute hand-scrawled names and titles on senior thesis drafts.
  • Page numbers!  Every technical document over a page long should have page numbers. If you don’t know how to get automatic page numbers with your document processor, either stop using it or learn how!
  • Earlier this quarter I said that I did not care what reference and citation style you used, as long as it was one of the many standard ones. I’ve decided to change my mind on that—I do care somewhat what style you use for the reference list. Use a reference style that contains as much information as possible: full author names, full journal name, dates, locations of conferences, URLs, DOIs, … .  You may format it in any consistent manner, but provide all the information.
  • Use kernel density estimates instead of histograms when showing empirical probability distributions. My previous post explains the reasons.
  • Avoid using red-green distinctions in graphics. About 6% of the male population is red-green colorblind. There are color-blindness simulators on the web (such as http://www.color-blindness.com/coblis-color-blindness-simulator/) that you can use to check whether your color images will work.  Modern gene-expression heat maps use red for overexpression, blue for underexpression, and fade to white in the middle.  This scheme has the advantage of having the strong signals in saturated colors and the weak ones in white or pastels, blending into the white background.
  • Comma usage continues to be a problem for many students. I discussed three common comma situations in English:
    • Comma splices. Two sentences cannot be stuck together with just a comma—one needs a conjunction to join them. If a conjunction is not desired, an em-dash can be used (as in the previous sentence). Sometimes a semicolon can be used, but never a bare comma.
    • Serial comma. There are two different conventions in English about the use of commas before the conjunction in a list of three or more items. In American English, the comma is always required, but in British English the comma is often omitted. I strongly favor the American convention (also known as the serial comma or the Oxford comma), and I will insist on it for the senior theses—even for those students raised in the British punctuation tradition.
    • When using “which” to introduce a relative clause, the clause should be non-restrictive. That is, omitting the clause beginning with “which” should not change the meaning of the noun phrase that is being modified by the relative clause. Non-restrictive relative clauses should be separated from the noun phrase they modify with a comma. If you have “which” without a comma starting a relative clause, then check to see whether you need a comma, or whether you need to change “which” to “that”, because the clause is really restrictive. Note: “which” is gradually taking over the role of “that” in spoken English, but this language change is still not accepted in formal writing, which is more conservative than speech.
  • The word “however” is a sentence adverb, but it is not a conjunction. You can’t join two sentences with “however”. You can, however, use it to modify a separate sentence that contrasts with the previous one.
  • Colons are not list-introducers. Colons are used to separate a noun phrase from its restatement, and the restatement is often a list. The mistaken notion that colons are list-introducers comes from the following construction: the use of “the following” before a list. The colon is there because the list is a restatement of “the following”, not because it is a list. Note that two sentences back, I used a colon where the restatement was not a list. Similarly, I don’t use a colon when the list is
    • the object of a verb,
    • the object of a prepositional phrase,
    • or any other grammatical construct that is not a restatement or amplification of what came before the colon.
  • Most students in the class use “i.e.” and “e.g.” without knowing the Latin phrases that they are abbreviations for. I suggested that they not use the abbreviations if they wouldn’t use the Latin, but use the plain English phrases that they would normally use: “that is” and “for example”. If they must use the Latin abbreviations, they should at least punctuate them correctly—commas are needed to separate the “i.e.” and “e.g.” from what follows, just as a comma would be used with “that is” or “for example”.
  • Some students use the colloquial phrase “X is where …”, when what they mean is “X is …”. The “where” creeps in in some dialects of English to serve as a way of holding the floor while you think how to finish the sentence—it doesn’t really belong in formal technical writing.
  • “First”, “second”, and “last” are already adverbs.  They don’t need (and can’t really take) an “-ly” suffix. It grates on me the same way that “nextly” does. “Next” has exactly the same dual status as an adjective and an adverb, but for some reason does not often suffer the indignity of being draped with a superfluous “-ly”.
  • I recommend that students not use the verb “comprise”, as few use it correctly. You can say that “x, y, and z compose A”, “A is composed of x, y, and z”, or “A comprises x, y, and z”. The construction “is comprised of” is strongly frowned on by most grammarians—avoid it completely, and avoid “comprise”, unless its usage comes naturally to you.  “Compose” and “is composed of” are less likely to get you in trouble.
  • “Thus” does not mean “therefore”—”thus” means “in this manner”. Note that “thus” is an adverb, so there is no “thusly”.
  • “Amount” is used for uncountable nouns (like “information”), while “number” is used for countable nouns (like “cells”). There are many distinctions in English that depend on whether a noun is countable or not (the use of articles, the use of plural, “many” vs. “much”), but “number” vs. “amount” seems to be the one that causes senior thesis writers the most difficulty.

2015 January 23

Dress like it’s 1965 Winner

In Dress Like It’s 1965, I showed the clothes that I wore for UCSC’s “Dress Like It’s 1965” Day on Thursday, 15 Jan 2015, to help celebrate the 50th birthday of UCSC (including the marvelous shoes my wife painted). Today I found out that I won 1st place in the men’s category! Pictures of the other winners can be found at http://50years.ucsc.edu/kick-off/.

Here is the picture they took of me, which was used for the judging:


Copied from http://50years.ucsc.edu/css/assets/images/kick-off/winners/1-guy.jpg
Sorry, I can’t find the photographer’s name on the 50th anniversary website to give proper photo credit.

I feel like I cheated a bit, as I was reproducing what I wore in 1969–1971, not 1965. Also I’m wearing a modern digital watch, since I no longer own any analog ones and forgot to take the watch off. But the judges obviously weren’t too fussy.

2015 January 22

Kernel density estimates

Filed under: Uncategorized — gasstationwithoutpumps @ 22:29

In the senior thesis writing course, I suggested to the class that they replace the histograms that several students were using with kernel density estimates, as a better way to approximate the underlying probability distribution.  Histograms are designed to be easy to make by hand, not to convey the best possible estimate or picture of the probability density function. Now that we have computers to draw our graphs for us, we can use computational techniques that are too tedious to do by hand, but that provide better graphs: both better looking and less prone to artifacts.

The basic idea of kernel density estimation is simple: every data point is replaced by a narrow probability density centered at that point, and all the probability densities are averaged.  The narrow probability density function is called the kernel, and we are estimating a probability density function for the data, hence the name kernel density estimation.  The most commonly used kernel is a Gaussian distribution, which has two parameters: µ and σ. The mean µ is set to the data point, leaving the standard deviation σ as a parameter that can be used to control the estimation.  If σ is made large, then the kernels are very wide, and the overall density estimate will be very smooth and slowly changing. If σ is made small, then the kernels are narrow, and the density estimate will follow the data closely.
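If the description of averaging kernels seems too abstract, here is a tiny sketch of the idea in Python with numpy. This is a naive implementation for illustration only, not the code I used for the plots below, and the function name and sample data are made up:

import numpy as np

def kde_by_hand(grid, data, sigma):
    """Naive Gaussian kernel density estimate: center one Gaussian
    (mean=data point, std dev=sigma) on each data point, evaluate all
    of them at each grid position, and average."""
    diffs = grid[:, np.newaxis] - data[np.newaxis, :]   # shape (len(grid), len(data))
    kernels = np.exp(-0.5 * (diffs / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)   # average over data points gives the density estimate

# made-up example: four data points, density evaluated on a fine grid
data = np.array([1.0, 2.0, 2.5, 7.0])
grid = np.linspace(-2., 10., 200)
density = kde_by_hand(grid, data, sigma=0.5)   # integrates to roughly 1 over the grid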

The scipy Python package has a built-in function for creating kernel density estimates from a list or numpy array of data (in any number of dimensions). I used this function to create some illustrative plots of the differences between histograms and kernel density estimates.
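For anyone who hasn’t used it, the call pattern is minimal. Here is a sketch with made-up data (the bandwidth is chosen automatically by Scott’s rule unless you override it with the bw_method argument):

from scipy import stats
import numpy as np

data = np.random.exponential(scale=100., size=1000)   # made-up one-dimensional sample
kde = stats.gaussian_kde(data)        # bandwidth picked automatically (Scott's rule)
grid = np.linspace(0., data.max(), 500)
density = kde(grid)                   # estimated probability density at each grid point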

This plot has 2 histograms and two kernel density estimates for a sample of 100,000 points.  The blue dots are a histogram with bin width 1, and the bar graph uses bins slightly narrower than 5. The red line is the smooth curve from using Gaussian kernel density estimation, and the green curve results from Gaussian kernel density estimation on transformed data (ln(x+40.))  Note that the kde plots are smoother than the histograms, and less susceptible to boundary artifacts (most of the almost-5-wide bins contain 5 integers, but some have only 4).  The rescaling before computing the kde causes the bins to be wider for large x values, where there are fewer data points.



With only 1000 points, the histograms get quite crude, but kde estimates are still quite good, particularly the “squished kde” which rescales the x axis before applying the kernel density estimate.

With even more data points from the simulation, the right-hand tail can be seen to be well approximated by a single exponential (a straight line on these semilog plots), so the kernel density estimates are doing a very good job of extrapolating the probability density estimates down to the region where there is only one data point every 10 integers.

Here is the source code I used to create the plots. Note that the squishing requires a compensation to the output of the kernel density computation to produce a probability density function that integrates to 1 on the original data space.

#!/usr/bin/env python3

""" Reads a histogram from stdin
and outputs a smoothed probability density function to stdout
using Gaussian kernel density estimation

Input format:
  # comment lines are ignored
  First two columns are numbers:
	value	number_of_instances
  remaining columns are ignored.

Output format three columns:
  value	 p(value)  integral(x>=value)p(x)
"""

from __future__ import division, print_function

from scipy import stats
import numpy as np
import sys
import itertools
import matplotlib
import matplotlib.pyplot as plt

# values and counts are input histogram, with counts[i] instances of values[i]
values = []
counts = []
for line in sys.stdin:
    line=line.strip()
    if not line: continue
    if line.startswith("#"): continue
    fields = line.split()
    counts.append(int(fields[1]))
    values.append(float(fields[0]))

counts=np.array(counts)
values=np.array(values)

squish_shift = 40. # amount to shift data before taking log when squishing

def squish(data):
    """distortion function to make binning correspond better to density"""
    return np.log(data+squish_shift)

def dsquish(data):
    """derivative of squish(data)"""
    return 1./(data+squish_shift)

instances = np.fromiter(itertools.chain.from_iterable( [value]*num for value, num in zip(values,counts)), float)
squish_instances = np.fromiter(itertools.chain.from_iterable( [squish(value)]*num for value, num in zip(values,counts)), float)
num_points = len(squish_instances)

# print("DEBUG: instances shape=", instances.shape, file=sys.stderr)

min_v = min(values)
max_v = max(values)

squish_smoothed = stats.gaussian_kde(squish_instances)
smoothed = stats.gaussian_kde(instances)

step_size=0.5
grid = np.arange(max(step_size,min_v-10), max_v+10, step_size)

# print("DEBUG: grid=",grid, file=sys.stderr)

plt.xlabel("Length of longest ORF")
plt.ylabel("Probability density")
plt.title("Esitmates of probability density functions")

plt.ylim(0.01/num_points, 0.1)
plt.semilogy(values, counts/num_points, linestyle='None',marker=".", label="histogram bin_width=1")
plt.semilogy(grid,squish_smoothed(squish(grid))*dsquish(grid), label="squished kde")
plt.semilogy(grid,smoothed(grid), label="kde")
num_bins = int(5*num_points**0.25)
# density=True normalizes to a probability density (replaces the deprecated normed=True)
plt.hist(values, weights=counts, density=True, log=True, bins=num_bins,
	label="histogram {} bins".format(num_bins))

plt.legend(loc="upper right")
plt.show()
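For reference, the dsquish(grid) factor in the squished-kde curve is just the usual change-of-variables correction for probability densities: if y = squish(x) = ln(x + 40), then a density p_Y estimated in the squished space corresponds to p_X(x) = p_Y(squish(x)) · dsquish(x) = p_Y(ln(x + 40)) / (x + 40) in the original space, which is what the semilogy call for the squished kde computes. A quick sanity check (my addition, not part of the original script) can be appended after the code above, reusing its variables:

# Sanity check: the corrected squished kde should integrate to roughly 1
# over the plotted grid (not exactly 1, because the grid truncates the tails).
total = np.trapz(squish_smoothed(squish(grid)) * dsquish(grid), grid)
print("integral of corrected squished kde over grid:", total)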
