July 05, 2020
David's Tips on How to Read Pytorch
Pytorch has a great design: easy and powerful. Easy enough that it is definitely possible to use pytorch without understanding what it is doing or why. But it also gets better the more you understand.
As part of summer school at MIT, next week I'm doing a lecture to introduce students to pytorch. I have written a few code examples that I hope will give students a head start on understanding the design of pytorch. Each concept is illustrated visually with a cute minimal hackable example. All the examples are notebooks that are hosted on Google Colab.
It covers tensor indexing conventions, benchmarks gpu versus cpu speeds, explains autograd with simple systems, and plots what optimizers are doing using 2d problems. Then I put the pieces together with a detailed discussion of network modules and data loaders, training toy networks where the whole space can be visualized as well as a simple but realistic five-minute ResNet training example.
April 25, 2020
A COVID Battle Map
Whenever Heidi gets a headache after coming back from the hospital, I worry about losing her to COVID.
But I am very aware that, with the virus already so widespread, the decisive battle is no longer being fought by doctors in the hospitals. They are just buying time, containing the threat just like you and I do when we social distance.
The outcome will depend on a race between two global teams furiously trying to hack a dozen proteins. The good guys are thousands of biologists, an historic worldwide collaboration. The bad guys are the random forces of natural selection, the mutations that happen inside each carrier. Thanks to the Bedford lab at Fred Hutchinson, you can see a map of the battlefield here, tracing the random moves made by the bad guys: (data from GISAID)
What are the bad-guy mutations doing? A small study came out of Zhejiang University this week (medRxiv, not peer-reviewed) that hints at the risks as we let the virus propagate and evolve. They did cell-culture studies on 11 samples and found, for example, a 19-fold difference in cell-culture virulence between one version similar to the virus in WA, CA, OR, and VA (not very virulent) compared to one resembling strains in England and France (far more virulent). One of the versions from Wuhan was 249 times worse. (Strains common in NY or Italy were not included.)
So as we celebrate that WA state seems to be beating the virus, this study highlights that WA has just beaten one strain. The European strains spreading elsewhere are different and might actually be more deadly. I think it is important to contain covid before an even-worse strain spreads, as happened in 1918.
Happily, in 2020, we can map out a set of weak points that the good guys can counterattack. Here is a survey paper. Some notable targets:
The New York Times has beautiful renderings of all the molecular attack targets.
Unlike in a shooting war, we do not have news reporters going into the battlefield to report on the day's wins and losses. But maybe we should. None of these is a sure thing. But they all have a chance, and there are real salvos being launched on each of these targets.
On both sides, the battlefield is active.
March 25, 2020
COVID-19 Chart API
Here is the COVID-19 Live Chart API. Use it to create a custom live chart of COVID-19 stats on a linear or logarithmic scale, comparing the set of countries and states that you choose (or an automatically sorted set of worst states or countries), on the timeframe that you want to see.
New 3/27/20: You can now plot local data of most US Counties. Just type the counties, states, and countries you want to see into the search box, and you can make a custom graph focused on the localities you care about.
It is designed to help you use current data to anticipate the future. Click on "advanced options" on covid19chart.org. It just takes a few clicks to make a new visualization.
Once you have created a custom chart, you can email it or print it for your local policymaker. Or better yet, if you are making a dashboard that leaders will see every day, theme the graph dark or light to match your webpage and use the "bare" mode to embed it as an iframe, like this:
<iframe src="https://covid19chart.org/#/?start=%3E%3D50&include=WA%3BMA%3BNY%3BItaly&top=0&domain=Intl&theme=dark&bare=1" width="500" height="388"></iframe>
(The embedded chart is interactive.)
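If your dashboard generates these embeds programmatically, the URL can be assembled from its parts. Here is a minimal sketch in Python. The parameter names (start, include, top, domain, theme, bare) are copied from the example URL above; the function names chart_url and iframe_tag are hypothetical helpers made up for illustration, and the comments about what each parameter means are my reading of the example, not a specification.

```python
import html
from urllib.parse import urlencode

BASE = "https://covid19chart.org/#/?"

def chart_url(places, start=">=50", top=0, domain="Intl", theme="dark"):
    """Build a bare-mode chart URL from the parameters seen in the example embed."""
    params = {
        "start": start,               # starting threshold, e.g. ">=50" cases
        "include": ";".join(places),  # semicolon-separated places to plot
        "top": top,                   # copied as-is from the example URL
        "domain": domain,             # copied as-is from the example URL
        "theme": theme,               # "dark" or "light"
        "bare": 1,                    # bare mode, intended for iframes
    }
    # urlencode percent-escapes ">", "=" and ";" inside the values.
    return BASE + urlencode(params)

def iframe_tag(url, width=500, height=388):
    """Wrap a chart URL in an iframe, escaping "&" for use in an HTML attribute."""
    src = html.escape(url, quote=True)
    return f'<iframe src="{src}" width="{width}" height="{height}"></iframe>'

url = chart_url(["WA", "MA", "NY", "Italy"])
print(url)
print(iframe_tag(url))
```

With the defaults shown, chart_url reproduces the URL in the example embed, so swapping in a different list of places is enough to get a different chart.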
The data is live, pulled directly from the Johns Hopkins CSSE COVID data feed on github. Although that feed is in flux and changes format every few days, I will track their changes and keep the chart up-to-date as needed. Please email me (email@example.com) if you have any problems with this API.
The current data tell a simple story.
In the US, if we want to avoid a grim future, we need to be making better decisions now. Every state of our country is seeing a similar exponential explosion, just starting on different days. Please use these charts to tell this story. And thank you for helping our leaders understand the importance of our choices today.
March 24, 2020
Today marks the beginning of the COVID-19 crisis for me. It is the first day that surgeons are being called in from their regular duties to take care of the wave of COVID-19 patients at MGH. Heidi needs to run into the hospital. We will have weeks or months of this ahead.
I am terrified.
The COVID-19 chart has been updated to include both state-level and international statistics, and it includes an API so that you can make, link, and embed a custom chart that focuses on the states or countries of your choice. The (no doubt stressed-out) CSSE team has been screwing up the data feed, but I will keep the data cleaned and correct on the live chart as long as it can be patched together. Below we can see America first in the chart today.
Please use it as a tool to pressure your local policymakers to take this crisis seriously.
March 23, 2020
No Testing is not Cause for Optimism
Two readings and a thought related to covid-19 testing.
Lack of information requires us to believe two contradictory things at once. From a policy point of view, we need to understand that very few people are infected yet. And from a personal behavior point of view, we need to understand that many people are already infected.
Policy first. Some people think that the lack of testing means that there could be far more asymptomatic cases than we know, and therefore the disease could be far less deadly than we imagine.
But consider the case of the town of Vò, near the epicenter of the Italian outbreak, where all 3000 residents were tested. As severe as the outbreak is in Italy, it corresponded to less than 3% of the population being infected. So as bad as the Italian case is, at least in the one town that was tested, it could be 30 times worse. Blindness is not cause for optimism.
Which individuals should be tested? The right behavior is to do the things that maximize lives saved. That means testing should be done in situations where it would change care, for example on healthcare providers who do not have the option to isolate, so that they do not inadvertently spread it to other providers and patients.
But of course that means many infected people will be untested, so everybody needs to operate under the assumption that we are all infected.
Paradoxically this lack of information means we need to keep in mind two different realities at once. First, we need to recognize that almost nobody has it yet, so the society-wide damage can and will get far far worse; and second, that we and others are likely to have it, so our personal risk and responsibility is very high. We need to isolate.
The parable of two realities corresponds to the logarithmic and linear view of the disaster. I have posted an updated version of the covid-19 time series tracker, which provides both views on covid19chart.org.
March 22, 2020
Two Views of the COVID-19 Crisis
I have posted an interactive chart of USA COVID-19 cases.
This chart lets you see coronavirus data from two different points of view: the policymaker's view, and the doctor's view.
For policymakers, the chart lets you see USA data in the same way the Financial Times COVID-19 plot by John Murdoch compares policies internationally. Select the logarithmic totals with a '>=100' starting threshold, so that "day zero" is the first day there are 100 cases in a state. Over time, if different states' policies have different effects on the growth of the virus, the exponents, and therefore the slopes, will reveal the differences.
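To make the "day zero" alignment and the slope comparison concrete, here is a small sketch in Python. It is not the chart's actual code (that is HTML and JS); the function names are hypothetical and the two case series are synthetic, invented only to show the arithmetic.

```python
import math

def align_at_threshold(cumulative, threshold=100):
    """Drop the leading days below the threshold so index 0 is 'day zero'."""
    for day, cases in enumerate(cumulative):
        if cases >= threshold:
            return cumulative[day:]
    return []

def log_slope(series):
    """Least-squares slope of log10(cases) against day number (needs 2+ days)."""
    n = len(series)
    xs = range(n)
    ys = [math.log10(v) for v in series]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def doubling_days(slope):
    """Convert a log10 slope per day into a doubling time in days."""
    return math.log10(2) / slope

# Synthetic totals (not real data): one state doubling daily, one every two days.
fast = [25 * 2 ** d for d in range(8)]
slow = [40 * 2 ** (d / 2) for d in range(12)]

for name, series in (("fast", fast), ("slow", slow)):
    aligned = align_at_threshold(series)
    slope = log_slope(aligned)
    print(name, round(slope, 4), round(doubling_days(slope), 2))
```

Because both series start counting from their own day zero, the slopes can be compared directly even though the outbreaks began on different days.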
The other point-of-view is the doctor's-eye view. Doctors must deal with the patients who walk into the ER and who lie sick in ICU beds. To anticipate these numbers, switch to the 'delta linear' view in the current month. The spikes show why the panic is justified, and why minor policy changes have massive ramifications.
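The 'delta' view is just the day-over-day difference of the cumulative totals. The sketch below, in Python, shows that computation on a toy constant-growth model; the growth rates of 30% and 25% per day are made-up numbers chosen for illustration, not estimates or forecasts.

```python
def daily_new_cases(cumulative):
    """Day-over-day differences of a cumulative case count."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]

def project(start, daily_growth, days):
    """Cumulative cases under constant daily growth (a toy model, not a forecast)."""
    return [start * daily_growth ** d for d in range(days + 1)]

# Two hypothetical policies that differ only slightly in daily growth.
for growth in (1.30, 1.25):
    totals = project(100, growth, 30)
    new_on_last_day = daily_new_cases(totals)[-1]
    print(f"growth {growth:.2f}: about {round(new_on_last_day)} new cases on day 30")
```

On a logarithmic chart these two lines look nearly parallel, yet after 30 days the number of new patients arriving in a single day differs by a factor of more than three.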
The takeaway? The chart re-emphasizes the point that this is not a game. There is a huge gap between the "policymaker's" view and the "doctor's" reality on the ground. Slight changes from a policymaker's point of view have massive ramifications for doctors.
After our leaders negotiate about a "gradual" shutdown of car factories, Michigan illnesses explode. After beaches stay open for one last lucrative spring break party, Florida cases skyrocket. And what begins as a local outrage will become a healthcare shortage, then a nationwide menace. A single idiotic master of the universe could trigger an outbreak that will use up the ventilators that would have saved your grandfather.
In our 50 states we are all linked. Despite dramatically different local policies, it is likely that our rate of infection growth will be largely the same across the country. In coming days, this chart will tell the story of our national interconnectedness.
Please. We need to take the crisis more seriously than we are. Our corporate, city, state, and federal leaders are not doing enough. "Social distancing" of the coastal elite needs to give way to a much more universal regime of physical isolation, enforced shutdowns, shifting of priorities, deferral of debts, and testing, testing, testing, nationwide.
The graph automatically updates every day based on current data. Please share. And please isolate.
I made the chart to help Heidi (who is a surgeon at MGH) see summaries of some USA statistics that are not being plotted in the media. The code is open-source at github.com/davidbau/covid-19-chart. It is just a bit of HTML and JS, and should be easy to extend to show more information. Pull-requests are welcome.
January 12, 2018
The Purpose of AI
What does it mean for an AI to be good?
Is Omniscience Good?
There are benefits to having computer systems that know everything. For example, yesterday a friend recounted a story about leaving a laptop in a taxi in China. Local police stations in China have a system that can call up any recorded video anywhere in the city, so they used the approximate time of the taxi ride to obtain a video clip of the exact moment of the cab pickup. Soon, they had the plate numbers and called the driver, who promptly found the laptop and returned it to its owner. Today, routine total surveillance in China is coupled with AI systems that constantly sift through the vast stream of data to identify and track every individual person, catalog every interaction, and flag anomalous behavior.
This makes prosecuting crime very easy in China. The court will be presented with a video tape summary of footage of the accused in the hours and days before and after the crime. AI systems, connected to a total surveillance apparatus, are able to automate weeks of police work and create a narrative about why a person is guilty. The same systems also simplify the hard work of putting a rapid stop to uncomfortable social disruptions such as demonstrations and protests.
China has no Fourth or First Amendment to give it pause, and so that country gives us a glimpse of what is possible with widely available technology today. And maybe it is a picture of humanity's future everywhere. Quiet streets, low crime, no picketing. Never lose a laptop again.
Is that a good thing?
The Purpose of AI
In our pursuit of making AI systems that are more accurate, faster, and leaner, we risk losing sight of the fundamental design question: What is it for? The systems that we build are complex, with multiple intelligent agents interacting in the system, some human, and some not. So to understand the design of the whole system, we must ask, what is the role of the human, and what is the role of the AI?
Both humans and AI can perceive, predict, and generalize, so there is sometimes a misperception that the two roles are interchangeable. But they are not. Humans stand apart because their purpose is to be the active agents, the deciders. If that is the case, then what is the role of the AI? Can an AI have agency?
There are two forms of interaction between AI behavior and human behavior where agency seems messy.
The problem with optimizing a system around these two design goals is that they presume no role for human agency. It is assumed that a good system will make more accurate predictions - for example the way that Facebook is very good at predicting which thing you will click on next. And it is assumed a good system will be more effective at shaping future behavior - for example the way Google is very good at arranging advertising in a way that maximizes your likelihood of clicking on it.
If a system is designed around those principles alone, the humans are just treated as a random variable to be manipulated, and there is no decision maker in the design. These designs are incomplete. Like any engineered system, an AI is always designed for some purpose. When we do not consider that purpose, the actual decision makers have been erased from the picture.
The proper purpose of an AI is this:
What a Good AI Does
The question of AI goodness comes down to how we can evaluate whether an AI is good or not. We cannot stop at evaluating merely whether an AI is more accurately predictive, or whether it is more effective in achieving an outcome.
We need to be transparent about answering the questions:
For example, with the Chinese surveillance system, the people being observed by the cameras are not making any decisions that are improved by the AI. The people on the street are not the users. The actual users are the people behind the console at the police station: they are the ones whose agency is amplified by the system. They use the system to help decide what to look at, who to call, and who to arrest. To understand whether the AI is good, we need to understand whether it is serving the right set of users, and whether their decisions are improved. That means beginning with an honest discussion of what it means for a police officer to make a good decision. The right answer is likely to be more complicated than a question of crime and punishment.
Most of us work on more prosaic systems. Today I spoke with a researcher who is applying AI to an educational system. She has a large dataset of creations (student programs) made by thousands of students, and she wants to make suggestions to new students about what pieces (program statements) they might want to include in their own creations. In her case, the target user is clearly the student making the creation, and the system is being optimized to predict the user's behavior.
However, the right metric is not predictive accuracy, but whether the user's decisions are improved. That gets to a more difficult discussion of what it means to make a good decision. For example, if most students in her data set share a common misconception about the subject being learned, then the system will optimize predictive accuracy by propagating the same misconception to future students. But that does not amplify the agency of users; it does not improve decision making. Instead, it is exactly the type of optimization that results in an AI that will dull the senses of users.
This is the same problem being faced by Facebook and Google today. Misconceptions, lazy decision making, and addictive behavior are all common human phenomena that are easy to predict and trigger, and so when we optimize systems to maximize accuracy and efficacy in their interactions with humans, the systems solve the problem by propagating these undesirable behaviors. The AI succeeds in solving its optimization by robbing humans of their agency. But this is not inevitable: AI does not need to dehumanize its users.
Building Good AI is Hard
To build good AI, it is not enough to ask our AI to merely predict behavior or shape it reliably. We need to understand how to create AI that helps amplify the ability of human users to make good decisions. We need to understand who the users are and what decisions they make.
In the end, building a good AI means building an authentic understanding of what it means to make a good decision.
December 19, 2017
npycat for npy and npz files
Pytorch, Theano, Tensorflow, and Pycaffe are all python-based, which means that I end up with a lot of numpy-based data and a lot of npy and npz files sitting around my filesystem, all storing my data in a way that is hard to print out. (Why this format?)
Do you have this problem? It is nice to pipe things into grep and sed and awk and less, and, as simple as it is, the npy format is a bit inconvenient for that.
> npycat params_001.npz
0.46768 2.4e-05 2.03e-05 2.3e-05 ... 2.4e-05 7.4e-06 5.1e-06 4.5e-06
2.4e-05 0.46922 0.0002 1.2e-05 ... 5.2e-05 5.9e-05 2.7e-05 5.3e-06
2.6e-05 0.00026 0.59949 8.3e-05 ... 7.4e-06 5.6e-05 5.9e-06 1.3e-05
...
1.1e-05 8.59e-05 6.4e-05 9.74e-05 ... 2e-05 0.68193 2.2e-05 1.7e-05
5.3e-06 2.8e-05 4.8e-06 8.4e-06 ... 0.00015 1.6e-05 0.49022 2.6e-05
4.8e-06 5.6e-06 1.06e-05 1.5e-05 ... 6.3e-06 1.3e-05 2.68e-05 0.50255
xi: float32 size=6400x6400
0.08672 0.09111 0.07268 0.10268 ... 0.06562 0.0652 0.09805 0.09459
err: float32 size=6400
-0.22102 -0.2293 -0.2118 -0.2582 ... -0.2056 -0.2106 -0.2412 -0.243
coerr: float32 size=6400
None
rho: object
0.0001388192177
delta: float64
1 1 1 1 ... 1 1 1 1
theta: float32 size=6400
0.90006 0.90004 0.90002 0.89994 ... 0.89998 0.89999 0.89996 0.89994
gamma: float32 size=6400
By default, all the data is pretty-printed to fit your current terminal column width, with a narrow field width, pytorch-style. But the
Other flags provide a Swiss-army-knife array of slicing and summarization options, to make it a useful tool for giving a quick view of what is happening in your data files. What are the mean, standard deviation, and L-infinity norm of a block of 14 numbers in the middle of my matrix?
> npycat params_001.npz --key=xi --slice=[25:27,3:10] --mean --std --linf
4.91e-06 0.0001 4.9e-06 1.09e-05 1.93e-05 0.000118 1.01e-05
0.000318 2.42e-05 0.000182 9.1e-06 1.88e-05 4.02e-05 0.00011
float32 size=2x7 mean=0.000069 std=0.000087 linf=0.000318
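As a sanity check on what those summary flags report, the same three statistics can be recomputed by hand from the 14 printed values. This is a sketch in plain Python, not npycat's implementation: the function name summarize is made up, and the inputs are the already-rounded display values, so agreement is only expected to the printed precision. The printed std lines up with the population standard deviation (dividing by N), which is what statistics.pstdev computes.

```python
import statistics

# The 2x7 block printed above (slice [25:27,3:10]); values are rounded for display.
block = [
    [4.91e-06, 0.0001, 4.9e-06, 1.09e-05, 1.93e-05, 0.000118, 1.01e-05],
    [0.000318, 2.42e-05, 0.000182, 9.1e-06, 1.88e-05, 4.02e-05, 0.00011],
]

def summarize(rows):
    """Recompute shape, mean, population std, and L-infinity norm of a 2-d block."""
    flat = [value for row in rows for value in row]
    return {
        "size": f"{len(rows)}x{len(rows[0])}",
        "mean": statistics.mean(flat),
        "std": statistics.pstdev(flat),          # population std: divide by N
        "linf": max(abs(value) for value in flat),  # largest absolute value
    }

s = summarize(block)
line = f"size={s['size']} mean={s['mean']:.6f} std={s['std']:.6f} linf={s['linf']:.6f}"
print(line)  # size=2x7 mean=0.000069 std=0.000087 linf=0.000318
```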
Is that theta vector really all 6400 ones from beginning to end?
> npycat params_000.npz --key=theta --min --max
1 1 1 1 ... 1 1 1 1
float32 size=6400 max=1.000000 min=1.000000
Also npycat is smart about using memory mapping when possible so that the start and end of huge arrays can be printed quickly without bringing the whole contents of an enormous file into memory first. It is fast.
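To illustrate why memory mapping makes this fast, here is a standalone sketch using only Python's standard mmap module. It is not npycat's code, and it simplifies in one important way: it reads a headerless file of raw floats, whereas a real npy file has a header before the data (with numpy itself, np.load accepts mmap_mode="r" for this purpose). The function name head_and_tail and the file name big.raw are made up for the example.

```python
import mmap
import os
import struct
import tempfile
from array import array

def head_and_tail(path, count=4, fmt="f"):
    """Return the first and last `count` values of a headerless file of floats.

    Only the pages that are actually touched get read from disk, so the cost
    does not grow with the size of the file.
    """
    item = struct.calcsize(fmt)
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            total = len(mm) // item
            head = struct.unpack_from(f"{count}{fmt}", mm, 0)
            tail = struct.unpack_from(f"{count}{fmt}", mm, (total - count) * item)
    return head, tail

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "big.raw")
    with open(path, "wb") as f:
        # One million native float32 values: 0.0, 1.0, 2.0, ...
        array("f", (float(i) for i in range(1_000_000))).tofile(f)
    head, tail = head_and_tail(path)
    print(head, "...", tail)
```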
The full usage page:
npycat --help
usage: npycat [-h] [--slice slice] [--unpackbits [axis]] [--key key] [--shape]
              [--type] [--mean] [--std] [--var] [--min] [--max] [--l0] [--l1]
              [--l2] [--linf] [--meta] [--data] [--abbrev] [--name] [--kname]
              [--raise]
              [file [file ...]]

prints the contents of numpy .npy or .npz files.

positional arguments:
  file                 filenames with optional slices such as file.npy[:,0]

optional arguments:
  -h, --help           show this help message and exit
  --slice slice        slice to apply to all files
  --unpackbits [axis]  unpack single-bits from byte array
  --key key            key to dereference in npz dictionary
  --shape              show array shape
  --type               show array data type
  --mean               compute mean
  --std                compute stdev
  --var                compute variance
  --min                compute min
  --max                compute max
  --l0                 compute L0 norm, number of nonzeros
  --l1                 compute L1 norm, sum of absolute values
  --l2                 compute L2 norm, euclidean size
  --linf               compute L-infinity norm, max absolute value
  --meta               use --nometa to suppress metadata
  --data               use --nodata to suppress data
  --abbrev             use --noabbrev to suppress abbreviation of data
  --name               show filename with metadata
  --kname              show key name from npz dictionaries
  --raise              raise errors instead of catching them

examples:
  just print the metadata (shape and type) for data.npy
    npycat data.npy --nodata
  show every number, and the mean and variance, in a 1-d slice of a 5-d tensor
    npycat tensor.npy[0,0,:,0,1] --noabbrev --mean --var
December 18, 2017
In Code We Trust?
As world leaders show themselves prone to falsehood, corruption, greed, and malice, it is tempting to find a new authority in which to place our trust. In today's NYT, Tim Wu observes that the rise of Bitcoin evidences humanity's new trust in code: "In our fear of human error, we are putting an increasingly deep faith in technology."
But is this faith well-placed if we do not know how code works or why it does what it does?
Trust in AI Today is about Trust in Testing
Take AI systems. Deep networks used to parse speech or recognize images are subject to massive batteries of tests before they are used. And so in that sense they are more scrutinized than any human person we might hire to do the same job. Trusting a highly scrutinized system seems much better than trusting something untested.
But here is one way that modern AI falls short: we do not expect most AIs to justify, explain, or account for their thinking. And perhaps we do not feel the need for any explanation. Even though explainability is often brought up in the context of medical decisions, my physician friends live in a world of clinical trials, and many of them believe that such rigorous testing on its own is the ultimate proof of utility. You can have all the theories in the world about why something should work, but no theory is as important as experimental evidence of utility. What other proof do we need beyond a rigorous test? Who cares what anybody claims about why it should work, as long as it actually does?
Battle: LeCun vs Rahimi
How much faith to place in empirical versus theoretical results is a debate that is currently raging among AI researchers. On the sidelines of the NIPS 2017 conference, a pitched argument broke out between Yann LeCun (the empiricist) and Ali Rahimi (the theoretician), who disagree on whether empirical AI results without a theoretical foundation just amount to a modern form of alchemy.
I side with Rahimi in revulsion against blind empiricism, but maybe I have different reasons than he. I do not worship the mathematics of rigorous theory. I think the relationship with humans is what is important. We should not trust code unless a person is able to understand some human-interpretable rules that govern its behavior.
The Mathematics of Interpretability
There are two reasons that test results need to be complemented by understandable rules. One is mathematical, and the other is philosophical.
Math first. Our modern AI systems, by their nature, respond to thousands of bits of input. So we should hold any claim of thoroughness of testing up against the harsh fact that visiting each of 2^1000 possible inputs - just 125 bytes of distinctive input state - would require more tests than atoms in the observable universe, even if every atom had a whole extra universe within it. Most realistic input spaces are far larger, and therefore no test can be thorough in the sense of testing any significant portion of the possibilities.
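The comparison is easy to check with exact integer arithmetic. The sketch below uses Python's arbitrary-precision integers; the figure of 10^80 atoms in the observable universe is a commonly cited rough estimate that I am supplying as an assumption, not a number from the argument above.

```python
# Assumption: roughly 10**80 atoms in the observable universe (order of magnitude only).
ATOMS_IN_UNIVERSE = 10 ** 80

possible_inputs = 2 ** 1000            # distinct states of 1000 input bits
nested_atoms = ATOMS_IN_UNIVERSE ** 2  # every atom holding a whole universe of atoms

print(len(str(possible_inputs)))       # 302 decimal digits
print(possible_inputs > nested_atoms)  # True
```

Even under the nested-universe assumption, the input space is larger by well over a hundred orders of magnitude.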
Furthermore, a sample can only accurately summarize a distribution under the assumption that the world never changes. But humanity imposes a huge rate of change on the world: we change our climate rapidly, we disrupt our governments and businesses regularly, we change our technology faster, and whenever we create a new computer system, adversaries immediately try to change the rules to try to beat it.
Testing is helpful, but "exhaustive" testing is an illusion.
The Philosophy of Interpretability
Philosophy next. This impossibility of testing every possible situation in advance is not a new problem: it has been faced by humanity forever (and, arguably, it is also one of the core problems facing all biological life).
It is in response to this state explosion that mankind invented philosophy, law, engineering, and science. These assemblages of rules are an attempt to distill what we think is important about the individual outcomes we have observed, so that when unanticipated situations arise, we can turn to our old rules and make good, sound decisions again. That is the purpose of ethics, case law, and construction standards. That is the reason that the scientific method is not just about observations, but about creating models and hypotheses before making observations.
We should hold our code to the same standard. It is not good enough for it to perform well on a test. Code should also follow a set of understandable rules that can anticipate its behavior.
Humans need interpretable rules so that we can play our proper role in society. We are the deciders. And to decide about using a machine, we need to be able to see whether the model of action used by the machine matches up with what we think it should be doing, so that when it inevitably faces the many situations in a changing world that will have never been tested before, we can still anticipate its behavior.
If the world never changes and the state space is small, mechanisms are not so important: tests are enough. But that is not the purpose of code in the modern world. Code is humanity's way of putting complexity in a bottle. Therefore its use demands explanations.
Are Understandable Models Possible?
This philosophy of rule-making all sounds vague. Is it possible to do this usefully while still creating the magic intelligent success of deep networks?
I am sure that it is possible, although I don't think it is necessarily easy. There is potentially a bit of math involved. And the world of AI may be easier to explain, or it may not. But it is worth a try.
So, I think, that will be the topic of my dissertation!
December 14, 2017
Dear A.G. Schneiderman,
The fraud is particularly infuriating because, as readers of this blog know, I was one of the engineers who devoted two decades of my life to building fundamental Internet technologies....Continue reading "Net Kleptocracy"
November 30, 2017
It's Our Responsibility
It is repulsive how Trump's daily actions transform American power into a force for evil. But we cannot turn away. In the end, it is our country, our system, and our choice. There is no more democratic nation, none with more freedom of speech, none with more vigorous public debate, none with deeper civic institutions. We cannot blame the evil on some tyrant or some invader. We debated, we campaigned, we voted. Trump is the one we chose.
His failure is our failure. It is our responsibility.
The ongoing conversation about Trump's obvious shortcomings misses the point. It is not about him. It is about us. We need to figure out how to remedy our failure as a democracy.
June 28, 2017
Volo Ergo Sum
Descartes had it wrong. Cogito ergo sum - I think therefore I am - was his proposal that skepticism, cognition, and reason are the essence of human existence. While this view was sensible in 1600 as European intellectuals were emerging from an age of superstition, the proposal is ridiculous on its face in the highly engineered 21st century world. Who today would seriously characterize humanity as being defined by our powers of reason? Today we stand at the precipice of human-level AI. And yet when we create machines with broad and deep powers of reason outstripping human cognition, the result is utterly inhuman. To think is not to be.
Volo ergo sum. The alternative is an old idea, a slogan coined by Maine de Biran at the dawn of the first industrial revolution in 1800 when he saw the contradiction in Descartes' proposal.
What does it mean to be human?
Exercising volition with competence is not a trivial thing. Most of us do not know what we really want, or even how to figure it out. We assume, superstitiously, that free will is automatic, that it is what happens when we are left to our animal instincts. But volition is far harder than just doing whatever we feel like. Developing our will means predicting our future selves, identifying not only our hunger today, but our desires tomorrow, our goals for next year, our aspirations for life. It means understanding the interaction of our own aims with the desires of others, our effects on each other, our hopes for society, and our vision for humanity. It means refining our ideals and honing our preferences, recognizing what we see as cute, what is profound, and what is beautiful. And it means knowing how to identify the slim intersection between that which is most desirable and that which is most possible.
Free will is not easy to exercise well: it is a developed skill. But it is a skill that we leave pitifully untrained in modern society. Our undeveloped sense of purpose comes from the fact that for all our modernity, we still live according to Cartesian values articulated in 1600. We spend 12 or 20 years of schooling to follow the path of Descartes, accumulating knowledge and developing our powers of inquiry and reason. But there is no curriculum that trains our powers of agency.
I believe this omission is the reason modern society is descending into crisis.
June 05, 2017
A Crisis of Purpose
Dear Senator Biden,
In your focus on the dignity of work, I believe you have identified the great political problem facing Americans today. However, I fear that the problem is deeper and more fundamental than you have articulated.
In the U.S., Democrats and Republicans both suffer from the same lack of political leadership. Trump, in all his boorishness, is transparent in his need to be loved by the people even as he plunders the country. But Democratic leaders suffer from the same disease, even if it is less obvious. When you echo the trope that you "work for the people," it reflects a focus on gratitude towards the leaders themselves, the wrong goal completely.
A tip for any leader: it's not about you.
The biggest challenge facing modern Americans is our loss of purpose. Our entire national economic policy is geared towards creating the most efficient means of production, making the machine that lets one person do the work of 50, freeing the 49 to do something else. But this logic takes human efficacy for granted: that is the fallacy faced by the other 49 as they search for their role in life. As a researcher in artificial intelligence, I know what the most efficient systems look like because I build them every day. Unsurprisingly, the most efficient systems do not involve humans.
What does it mean to be human?
It has taken some years for this problem to hit the soft side of our economy. The creative class is safe from automation as long as computers have difficulty generating high-level insights and good writing. And workers who pluck berries are safe from automation as long as machines lack the dexterity of human fingers. But if you think these types of jobs are permanently safe from automation, I encourage you to watch a presentation on the automation of berry-picking. The problem is simpler than it may seem, and the innovations make it clear that the berry-picking profession is soon to vanish. My message from the world of AI research is that high-level insight is also likely to be much simpler than it may seem. The crisis of human purpose which has roiled the manufacturing sector over the last 50 years will become a universal crisis within our children's lifetime.
The need for a renewed human purpose is the reason that improving health insurance fails to animate voters as much as it seems like it should. If the state is willing to care for me and my family even if I become incapacitated, then what is my purpose? Why am I even needed? The same can be said for food stamps, job retraining, universal preschool, parental leave, and a host of other Democratic priorities. These policies make sense if we think the main problem facing society is the efficient production and fair distribution of scarce resources. But in an age of automation, these policies do not demand any crucial sacrifice, and they do not restore the biggest thing which is being taken from humanity in the 21st century: a genuine reason for being.
Therefore, I admonish our politicians to answer this question: Why are people needed? The leader who will steer us out of this century's political mess will be the one who can address the people, articulate a vision for the future, and say,
Your enthusiastic supporter,
May 24, 2017
I love programming, and have made a nice career of it at Google, Microsoft, and startups. But when I got old enough to contemplate what I want to do with the rest of my time on earth, I came to this realization:
But we create computers that program humans instead.
To push against this trend, I turned away from work on the social algorithm of search, and instead began creating tools and lessons to make programming accessible to children.
While child-oriented programming may seem a juvenile escape from the rigors of a competitive business, I think making programming more easily learnable is one of the most important problems facing society today. To avoid feeding a decline of the human condition where people become replaceable by computers, we must make our technology more comprehensible and programmable. Our industry needs to turn its focus away from algorithms that manipulate human behavior, and towards tools that amplify imagination. This means not only changing our technology, but changing the way people know how to use it.
My easy-programming project was called Pencil Code. It was a short book that became a website, and it got going while I was sitting near Professor Hal Abelson at his desk at Google Boston, where he coordinated a similar project, App Inventor. Hal is the author of one of the seminal textbooks in computer science which set the tone for a generation of practitioners, and he continues to lead the charge on issues such as privacy and security and ethics in our field.
Over many discussions with Hal, I came to realize that changing society cannot be done only by making widgets: changing society means articulating the ideas that frame everybody's thinking.
Eventually, determined to make a difference, I retired from Google. I packed up my desk and walked across the street to MIT to pursue a PhD and begin a new career as an idea-maker - a researcher.
A First Semester Realization
At MIT, Professor Rob Miller also works on creating programming tools that make programming easier to learn, and he took me under his wing as I set out on that path. My first year would not be spent doing much research - my one academic contribution was to write a review paper surveying the field. Instead my first year as a new student was spent on the array of TQE classes they require for you to broaden your view of the field.
So I sharpened my pencil and re-learned the skill of writing homework and exams. I took classes in security and vision to update my knowledge, but I was left with a feeling that the problem of opaque computing was fundamental to these fields also. Programmers seem intractably blind to security holes; and the remarkable power of deep neural networks seems inextricably linked to their incomprehensibility by humans. I am old enough to see how the field has changed: these problems did not exist when I began in computer science. My conclusion was horrifying.
MIT is a remarkable cauldron for incubating such ideas and putting them to action, so my story will continue next time I have time to write.
May 23, 2017
Government is Not the Problem
Dear Senator Warren,
I write to you because I believe your leadership may help steer this country out of our current national crisis. As impeachment becomes increasingly inevitable, we need our leaders to avoid feeding the disastrous antigovernment philosophy that grew out of the Nixon impeachment.
We need you to keep pounding away at the message:
The destruction of government by the Republican party is the problem.
Since the Reagan revolution, the Republican party has worshipped the perverse idea that "government is always the problem," which is an oversimplification and corruption of Reagan's vision that government by the elite is dangerous. Advocating the destruction of government is a politically potent message since nobody likes paying tolls, taxes, or fines. But the message is a cynical repackaging of anarchy that benefits only the rich and powerful, and it is the exact opposite of Reagan's vision. The Trump administration is proof positive that trying to "deconstruct the administrative state" is a disaster for everybody but the most greedy of the elite. Our country is being sacked.
Please - Senator Warren - ideas are important, and we depend on articulate leaders like you to help shape the discourse of our nation.
Sincerely, your constituent and supporter,
May 22, 2017
Interestingly for me, grandpa's travel to the U.S. was during the years of the discriminatory Oriental Exclusion Acts that limited immigration from China to the U.S. to zero people per year, so I do not know how he entered the United States in 1941.
He was a student from an elite family and not one of the Chinese laborers that Congress feared, so maybe he entered under a legal loophole. I wonder, suddenly, if that is why my Anglicized last name has a Germanic spelling, and why my grandfather and grandmother never spoke in Chinese in public, even to each other. Did grandpa enter under a German identity? Did he avoid speaking Chinese to avoid the attention of racist immigration officers? I think entry was probably very tricky, and very few Chinese-American families have the same immigration story and timeline as mine. Entry from China was virtually nil from 1924 to 1943.
Incidentally, when people say Asians are a "model minority" and ask "why are Chinese people so smart?" I think the reason is that for many decades even before and after this period, there were draconian and racist exclusion laws that meant that you needed to be a sophisticated member of the elite, with money and access to lawyers, to navigate the loopholes and enter the United States. This continues to be true today.
Thus Chinese and other Asian immigrants have long been children of the rich, educated elite. No surprise that when they come to the United States, they join the ranks of the rich, educated elite.
May 21, 2017
David Hong-Toh Bau, Sr
I am named after my grandfather, who was the scion of a wealthy Shanghai family and an enterprising young banker in Shanghai and Hong Kong in the 1930s. But in 1941, the intellectually ambitious and multilingual young man grew restless and decided to embark on a big "Act 2" for his life, leaving the comfort of a privileged life in China to travel to the U.S. to train himself as an international economist.
Act 2: A Mixed-Up Move to America
So, in the summer of 1941 at the age of 28, grandpa made the rare trek from Shanghai to the University of Maryland, together with his pregnant young wife Fanny and his baby daughter Deanna.
I do not know if David H. Bau, Sr. flew to the U.S. on the China Clipper into San Francisco or took a steamer like the Nippon Maru into Los Angeles, but there was certainly no convenient way of physically getting from Shanghai to College Park in 1941. While traveling halfway around the world and traversing the continental United States that summer, my grandmother went into labor. So they stopped in the middle of their trip and delayed their arrival at UMD for a few months to take care of the new baby. My father Paul was the first American-born kid in our family, and it is a fitting designation. Born in Chicago, my dad is really American through and through; he's all about football, poker, stamp collecting, and hamburgers, and he's a dyed-in-the-wool Republican.
Act 3: The American War Effort
But what should happen on December 7 of 1941 as David and Fanny were taking care of little baby Paul? When the Japanese bombed Pearl Harbor and America entered the war on the Pacific front, it brought an instant halt to normal commerce with China, and my grandfather found himself cut off from the funds from his family that would have supported his leisurely life and his graduate studies. He suddenly needed a job to pay the rent for his house in D.C.
So the young graduate student applied for a job at the U.S. War Department, where his multilingual skills and knowledge of Asian banking and agricultural economics would come in handy in the fight with Japan. He was a thinker, not a fighter, and so naturally he was recruited as an intelligence officer in the OSS, the wartime predecessor of the CIA. We don't know much about his job as a spy, but it probably involved the deskwork that would have been needed to wage economic warfare against Japan. To implement an effective blockade, you need to know which types of trade to interrupt and how. You need an Asian economist to study the problem. Due to the exclusion act, my grandfather might have been the only one in the country.
The war years witnessed global turmoil, including the communist revolution in China, during which the family's fortune in China was decimated. There are various old arguments in my extended family of which I am only vaguely aware, but I believe they go back to the stress and strain of dividing up scraps of remaining family wealth from those turbulent years.
My grandfather would recount the non-secret part of his job at the end of the war, which was exactly the opposite of the blockade he might have created during the war. General MacArthur recruited him into the army, and sent him into Japan to lead the agricultural reconstruction of that broken country. My grandfather was responsible for re-feeding Japan and getting its population back on its feet; he says that this was the most rewarding work of his life.
Act 3: International Economist
After the war, I do not know if he completed his graduate studies, but he did achieve his dream of becoming an international agricultural economist, working for the U.N. My father tells stories of a big family trip to Thailand where they lived in an old palace so large that they used to bicycle down the hallways. That must have been 1951: I can see on Google Books a U.N. report grandpa wrote that year called Agricultural Economic Survey of Sarapee District, Chiengmai Province, Thailand.
But then he turned down a senior post with the newly-formalized Food and Agriculture Organization, because he loved Washington D.C. and did not want to move the family to Rome. So my father and my aunt and uncle grew up as Washington D.C. kids. Their family house was just a few steps from the Capitol. To stay in D.C., my grandfather embarked on a new career as an American businessman.
Act 4: American Businessman
Due to the overt racism of the day, there were only a few realistic career avenues open for a midcentury Chinese-American businessman, and one of them was to open a Chinese restaurant. Apparently grandpa opened up two, one in Georgetown and a second one in the comfortable tropical climes of Puerto Rico where my dad graduated from high school (he still loves the island and has many friends there). My grandfather also used to recount stories of trying to become a farmer, unsuccessfully, with the new-wave crop of soybeans, on land in Puerto Rico. He failed at business several times before deciding that business was not for him. Then he moved on to his "Act 5", finding another way to live in his beloved city Washington D.C.
Act 5: Librarian of Congress
He went back to school, spending some time in Ann Arbor to get a degree in Library Science (I can find his graduate research funding support in 1962 and his graduation with a master's from the University of Michigan a couple of years later). With this training, he became one of the top Asian literature librarians in the country, taking a job around the corner from his D.C. house at the Library of Congress.
I found this 1995 obituary of my grandfather in the Washington Post. It summarized the story of his life after the war years.
DAVID H. BAU Library of Congress Librarian
The elderly senior librarian is the grandpa I remember, and he seemed so very happy in his Act 5. He brought me to his office at the LoC and showed me the shelf where he always had 10 Asian-language books that he was speed-reading simultaneously to catalog them. He told me that being a librarian was supposed to be his retirement job, but it was a job that he did longer than any other in his life. He was always full of jokes and energy, and he always had some sort of crazy project going on, such as renovating his own bathrooms or processing his own raw soybeans into other food products in his kitchen.
He died shortly after I was able to introduce him to Heidi, who I married not long after his death in 1995.
Grandpa's many adventures have given me the confidence to try to reinvent myself in my own life. His life was an inspiration, and I still miss him.
May 10, 2017
Dear Senator Collins
You are putting our democracy in danger. Your recent declarations about Trump's firing of Comey are unworthy of a democratically elected Senator, and you have lost my confidence to sit on the Senate intelligence committee and faithfully investigate matters related to Russian collaboration.
With yesterday's firing of the head of the FBI, our president is taking the actions of a corrupt dictatorship. By firing Comey after removing both Sally Yates and Preet Bharara, Trump has now eliminated the third senior official charged with examining corruption in the executive branch. It is clear that he will continue to fire investigators who dare to follow the facts where they lead, as soon as they lead too close to him.
How can you look the American people in the eye and say "Any suggestion that today’s announcement is somehow an effort to stop the FBI's investigation of Russia’s attempt to influence the election last fall is misplaced"?
Trump's continued massacre of our law-enforcement branch is as plain to see as it is when Egyptian leaders recently sacked their anticorruption officials, or when the Chinese communist leadership has imprisoned righteous lawyers. The charges are trumped up.
Before today I was a supporter. I believed you to be a smart, honest, upstanding New England senator. But your shocking defense of President Trump's dictatorial actions has made it clear that you are either a cynical opportunist or thoroughly corrupt yourself. No patriotic American could love our Constitution and also defend Trump's destruction of the Justice Department and the FBI.
Consider yourself on this voter's "evil politician" list as of today.
David Bau
|Copyright 2020 © David Bau. All Rights Reserved.|