Lost in Math: The New York Times Op-Ed

Lost in Math

In July of last year, I wrote a review, “The Perils of Particle Physics,” of Sabine Hossenfelder’s book Lost in Math: How Beauty Leads Physics Astray (Basic Books, June 2018). Lost in Math is a critical account of the disappointing progress in fundamental physics, primarily particle physics and cosmology, since the formulation of the “standard model” in the 1970’s.


Dr. Hossenfelder has followed up her book with an editorial “The Uncertain Future of Particle Physics” in The New York Times (January 23, 2019) questioning the wisdom of funding CERN’s recent proposal to build a new particle accelerator, the Future Circular Collider (FCC), estimated to cost over $10 billion. The editorial has in turn produced the predictable howls of outrage from particle physicists and their allies:

Letters to the New York Times from theoretical physicist and science popularizer Jeremy Bernstein and Harvard Physics Professor Lisa Randall

The Worth of Physics Research

Physicists take issue with an Op-Ed article arguing against expensive upgrades to the super collider at CERN.

An article in Slate:

Particle Physics Is Doing Just Fine

In science, lack of discovery can be just as instructive as discovery.

By Chanda Prescod-Weinstein and Tim M.P. Tait

And apparently informal criticism of Dr. Hossenfelder during a recent colloquium and presumably on the physics “grapevine”:

“Maybe I’m crazy”, Blog Post, February 4, 2019

“Particle physicists surprised to find I am not their cheer-leader”, Blog Post, February 2, 2019

Probably there will be additional fireworks.

My original review of Lost in Math covers many points relevant to the editorial. A few additional comments related to particle accelerators:

Particle physics is heavily influenced by the ancient idea of atoms (found, for example, in Plato’s Timaeus, about 360 B.C.): that matter is composed of tiny fundamental building blocks, also known as particles. The idea of atoms proved fruitful in understanding chemistry and other phenomena in the 19th and early 20th centuries.

In due course, experiments with radioactive materials and early precursors of today’s particle accelerators were seemingly able to break the atoms of chemistry into smaller building blocks: electrons and the atomic nucleus composed of protons and neutrons, presumably held together by exchanges of mesons such as the pion. The main flaw in the building block model of chemical atoms was the evident “quantum” behavior of electrons and photons (light), the mysterious wave-particle duality quite unlike the behavior of macroscopic particles like billiard balls.

Given this success, it was natural to try to break the protons, neutrons and electrons into even smaller building blocks. This required and justified much larger, more powerful, and increasingly more expensive particle accelerators.

The problem or potential problem is that this approach never actually broke the sub-atomic particles into smaller building blocks. The electron seems to be a point “particle” that clearly exhibits puzzling quantum behavior unlike any macroscopic particle from tiny grains of sand to giant planets.

The proton and neutron never shattered into constituents even though they are clearly not point particles. They seem more like small blobs or vibrating strings of fluid or elastic material. Pumping more energy into them in particle accelerators simply produced more exotic particles, a puzzling sub-atomic zoo. This led to theories like nuclear democracy and Regge poles that interpreted the strongly interacting particles as vibrating strings of some sort. (“Strong” here refers to the strong nuclear force that binds the nucleus together and powers both the Sun and nuclear weapons.) The plethora of mesons and baryons was explained as excited states of these strings, that is, of low energy “particles” such as the neutron, proton, and the pion.

However, some of the experiments observed electrons scattering off protons (the nucleus of the most common type of hydrogen atom is a single proton) at sharp angles as if the electron had hit a small “hard” charged particle, not unlike an electron. These partons were eventually interpreted as the quarks of the reigning ‘standard model’ of particle physics.

Unlike the proton, neutron, and electron in chemical atoms, the quarks have never been successfully isolated or extracted from the sub-nuclear particles such as the proton or neutron. This eventually led to theories that the force between the quarks grows stronger with increasing distance, mediated by some sort of string-like tube of field lines (for lack of better terminology) that never breaks however far it is stretched.

Particles All the Way Down

There is an old joke regarding the theory of a flat Earth. The Earth is supported on the back of a turtle. The turtle in turn is supported on the back of a bigger turtle. That turtle stands on the back of a third turtle and so on. It is “Turtles all the way down.” This phrase is shorthand for a problem of infinite regress.

For particle physicists, it is “particles all the way down”. Each new layer of particles is presumably composed of still smaller particles. Chemical atoms are composed of protons and neutrons in the nucleus and orbiting (sort of) electrons. Protons and neutrons are composed of quarks, although we can never isolate them. Arguably the quarks are constructed from something smaller, although the favored theories like supersymmetry have gone off in hard-to-understand multidimensional directions.

“Particles all the way down” provides an intuitive justification for building ever larger, more powerful, and expensive particle accelerators and colliders to repeat the success of the atomic theory of matter and radioactive elements at finer and finer scales.

However, there are other ways to look at the data. Namely, the strongly interacting particles — the neutron, the proton, and the mesons like the pion — are some sort of vibrating quantum mechanical “strings” of a vaguely elastic material. Pumping more energy into them through particle collisions produces excitations — various sorts of vibrations, rotations, and kinks or turbulent eddies in the strings.

The kinks or turbulent eddies act as small localized scattering centers that can never be extracted independently from the strings — just like quarks.

In this interpretation, strongly interacting particles such as the proton, and possibly weakly interacting seeming point particles like the electron, are composed of a primal material. (“Weak” here refers to the weak nuclear force responsible for many radioactive decays, such as the carbon-14 decay used in radiocarbon dating.)

In this latter case, ever more powerful accelerators will only create ever more complex excitations (vibrations, rotations, kinks, turbulence, etc.) in the primal material. These excitations are not building blocks of matter that give fundamental insight.

One needs rather to find the possible mathematics describing this primal material, perhaps a modified wave equation with non-linear terms for a viscous fluid or quasi-fluid. Einstein, de Broglie, and Schrödinger were looking at something like this to explain and derive quantum mechanics and put the pilot wave theory of quantum mechanics on a deeper basis.

A critical problem is that an infinity of possible modified wave equations exist. At present it remains a manual process to formulate such equations and test them against existing data — a lengthy trial and error process to find a specific modified wave equation that is correct.

This is a problem shared with mainstream approaches such as supersymmetry, hidden dimensions, and so forth. Even with thousands of theoretical physicists today, it is time consuming and perhaps intractable to search the infinite space of possible mathematics and find a good match to reality. This is the problem that we are addressing at Mathematical Software with our Math Recognition technology.

(C) 2019 by John F. McGowan, Ph.D.

About Me

John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).

How to Tell Scientifically if Advertising Works Explainer Video

AdEvaluator Sales Projections

[Slide 1]

“Half the money I spend on advertising is wasted; the trouble is I don’t know which half.”

This popular quote sums up the problem with advertising.

[Slide 2]

There are many advertising choices today including not advertising, relying on word of mouth and other “organic” growth. Is the advertising working?

[Slide 3]

Proxy measures such as link clicks can be highly misleading. A bad advertisement can get many clicks, even likes but reduce sales by making the product look bad in an entertaining way.

[Animation Enter]

[Wait 2 seconds]

[Slide 4]

Did the advertising increase sales and profits? This requires analysis of the product sales and advertising expenses from your accounting program such as QuickBooks. Raw sales reports are often difficult to interpret unless the boost in sales is extremely large such as doubling sales. Sales are random like flipping a coin. This means a small but profitable increase such as twenty percent is often difficult to distinguish from chance alone.

[Slide 5]

Statistical analysis and computer simulation of a business can give a quantitative, PREDICTIVE answer. We can measure the fraction of days with zero, one, two, or more unit sales with advertising — the green bars in the plot shown — and without advertising, the blue bars.

[Slide 6]

With these fractions, we can simulate the business with and without advertising.

The bar chart shows the results for one thousand simulations of a year of business operations. Because sales are random like flipping a coin, there will be variations in profit from simulation to simulation due to chance alone.

The horizontal axis shows the change in profits in the simulation compared to the actual sales without advertising. The height of the bars shows the FRACTION of the simulations with the change in profits on the horizontal axis.

The blue bars are the fractions for one-thousand simulations without advertising.

[Animation Enter]

The green bars are the fractions for one-thousand simulations with advertising.

[Animation Enter]

The vertical red bar shows the average change in profits over ALL the simulations WITH THE ADVERTISING.

There is ALWAYS an increased risk from the fixed cost of the advertising — $500 per month, $6,000 per year in this example. The green bars in the lower left corner show the increased risk with advertising compared to the blue bars without advertising.

If the advertising campaign increases profits on average and we can afford the increased risk, we should continue the advertising.

[Slide 7]

This analysis was performed with Mathematical Software’s AdEvaluator Free Open Source Software. AdEvaluator works for sales data where there is a SINGLE change in the business, a new advertising campaign.

Our AdEvaluator Pro software for which we will charge money will evaluate cases with multiple changes such as a price change and a new advertising campaign overlapping.

[Slide 8]

AdEvaluator is available on our web site:

mathematical-software.com

[Web Site Animation Entrance]

Click on the Downloads TAB for our Downloads page.

[Web Site Animation Exit]

[Download Links Animation Entrance]

AdEvaluator can be downloaded from GitHub or as a ZIP file directly from the downloads page on our web site.

[Download Links Animation Exit]

Or scan this QR code to go to the Downloads page.

This is John F. McGowan, Ph.D., CEO of Mathematical Software. I have many years of experience solving problems using mathematics and mathematical software, including work for Apple, HP Labs, and NASA. I can be reached at ceo@mathematical-software.com

###

Technical Article: http://wordpress.jmcgowan.com/wp/how-to-tell-scientifically-if-advertising-boosts-sales-and-profits/

(C) 2019 by John F. McGowan, Ph.D.


AdEvaluator on GitHub

AdEvaluator™ Daily Sales Pie Charts

https://github.com/MathematicalSoftware/AdEvaluator

AdEvaluator™ evaluates the effect of advertising (or marketing, sales, or public relations) on sales and profits by analyzing a sales report in comma separated values (CSV) format from QuickBooks or other accounting programs. It requires a reference period without the advertising and a test period with the advertising. The advertising should be the only change between the two periods. There are some additional limitations explained in the on-line help for the program.

(C) 2019 by John F. McGowan, Ph.D.


How to Tell Scientifically if Advertising Boosts Profits Video

Daily Sales Pie Charts


Short (seven and one half minute) video showing how to evaluate scientifically if advertising boosts profits using mathematical modeling and statistics with a pitch for our free open source AdEvaluator software and a teaser for our non-free AdEvaluator Pro software — coming soon.

Download the free open source version of AdEvaluator at http://wordpress.jmcgowan.com/wp/downloads/

(C) 2019 by John F. McGowan, Ph.D.


How to Tell Scientifically if Advertising Works Video

Daily Sales Pie Charts
AdEvaluator Demo Video (January 1, 2019)

Download: http://wordpress.jmcgowan.com/wp/downloads/

(C) 2019 by John F. McGowan, Ph.D.


How to Tell Scientifically if Advertising Boosts Sales and Profits

Annual Profit Projections

“Half the money I spend on advertising is wasted; the trouble is I don’t know which half.”

John Wanamaker, (attributed)
US department store merchant (1838 – 1922)

Between $190 billion and $270 billion is spent on advertising in the United States each year (depending on source). It is often hard to tell whether the advertising boosts sales and profits. This is caused by the unpredictability of individual sales and in many cases the other changes in the business and business environment occurring in addition to the advertising. In technical terms, the evaluation of the effect of advertising on sales and profits is often a multidimensional problem.

Many common metrics such as the number of views, click through rates (CTR), and others do not directly measure the change in sales or profits. For example, an embarrassing or controversial video can generate large numbers of views, shares, and even likes on a social media site and yet cause a sizable fall in sales and profits.

Because individual sales are unpredictable, it is often difficult or impossible to tell whether a change in sales is caused by advertising, by chance alone, or by some combination of advertising and luck.

The plot below shows the simulated daily sales for a product or service with a price of $90.00 per unit. Initially, the business has no advertising, relying on word of mouth and other methods to acquire and retain customers. During this “no advertising” period, an average of three units are sold per day. The business then contracts with an advertising service such as Facebook, Google AdWords, Yelp, etc. During this “advertising” period, an average of three and one half units are sold per day.

Daily Sales

The raw daily sales data is impossible to interpret. Even looking at the thirty day moving average of daily sales (the black line), it is far from clear that the advertising campaign is boosting sales.

Taking the average daily sales over the “no advertising” period, the first six months, and over the “advertising” period (the blue line), the average daily sales was higher during the advertising period.
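Data of this kind can be generated with a short script. The following is a minimal sketch rather than the AdEvaluator code: it assumes each period lasts about six months (182 days), draws daily unit sales from a Poisson distribution with the averages quoted above, and uses only the Python standard library. The names `poisson_sample` and `moving_average` are illustrative.

```python
import math
import random
from statistics import mean

UNIT_PRICE = 90.00        # price per unit, from the example above
DAYS_PER_PERIOD = 182     # assumption: each period lasts about six months


def poisson_sample(rng, rate):
    """Draw one Poisson-distributed count (Knuth's multiplication method)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def moving_average(values, window):
    """Trailing moving average; defined from day `window` onward."""
    return [mean(values[i - window + 1:i + 1])
            for i in range(window - 1, len(values))]


rng = random.Random(2019)  # fixed seed so the run is repeatable
no_ad_units = [poisson_sample(rng, 3.0) for _ in range(DAYS_PER_PERIOD)]
ad_units = [poisson_sample(rng, 3.5) for _ in range(DAYS_PER_PERIOD)]
daily_revenue = [units * UNIT_PRICE for units in no_ad_units + ad_units]
smoothed = moving_average(daily_revenue, 30)

print("average daily units, no advertising:", round(mean(no_ad_units), 2))
print("average daily units, advertising:   ", round(mean(ad_units), 2))
print("30-day moving average of revenue ranges from",
      round(min(smoothed), 2), "to", round(max(smoothed), 2))
```

Changing the seed shows how much the two period averages wander from run to run even though the underlying sales rates never change.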

Is the increase in sales due to the advertising or random chance or some combination of the two causes? There is always a possibility that the sales increase is simply due to chance. How much confidence can we have that the increase in sales is due to the advertising and not chance?

This is where statistical methods such as Student’s T test, Welch’s T test, mathematical modeling and computer simulations are needed. These methods compute the effectiveness of the advertising in quantitative terms. These quantitative measures can be converted to estimates of future sales and profits, risks and potential rewards, in dollar terms.

Measuring the Difference Between Two Random Data Sets

In most cases, individual sales are random events like the outcome of flipping a coin. Telling whether sales data with and without advertising is the same is similar to evaluating whether two coins have the same chances of heads and tails. A “fair” coin is a coin with an equal chance of giving a head or a tail when flipped. An “unfair” coin might have a three fourths chance of giving a head and only a one quarter chance of giving a tail when flipped.

If I flip each coin once, I cannot tell the difference between the fair coin and the unfair coin. If I flip the two coins ten times, on average I will get five heads from the fair coin and seven and one half (seven or eight) heads from the unfair coin. It is still hard to tell the difference. With one hundred flips, the fair coin will average fifty heads and the unfair coin seventy-five heads. There is still a small chance that the seventy-five heads came from a fair coin.
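The size of that "small chance" can be computed exactly from the binomial distribution. The sketch below is an illustration only; the function name `prob_at_least` is made up for this example.

```python
from math import comb


def prob_at_least(heads, flips, p_head):
    """Exact probability of getting `heads` or more heads in `flips` flips."""
    return sum(comb(flips, k) * p_head**k * (1 - p_head)**(flips - k)
               for k in range(heads, flips + 1))


# Chance that a FAIR coin looks at least as lopsided as the unfair coin's average
p_fair_ten = prob_at_least(8, 10, 0.5)         # 8 or more heads in 10 flips
p_fair_hundred = prob_at_least(75, 100, 0.5)   # 75 or more heads in 100 flips

print("fair coin, 8+ heads in 10 flips:   ", p_fair_ten)
print("fair coin, 75+ heads in 100 flips: ", p_fair_hundred)
```

With ten flips the fair coin mimics the unfair one several percent of the time; with one hundred flips the chance is smaller by many orders of magnitude, which is why more data makes the two coins easier to tell apart.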

The T statistics used in Student’s T test (Student was a pseudonym used by statistician William Sealy Gosset) and Welch’s T test, a more advanced T test, are measures of the difference in a statistical sense between two random data sets, such as the outcomes of flipping two coins one hundred times each. The larger the magnitude of the T statistic, the more different the two random data sets are in a statistical sense.

William Sealy Gosset (Student)

Student’s T test and Welch’s T test convert the T statistics into probabilities that the difference between the two data sets (the “no advertising” and “advertising” sales data in our case) is due to chance. Student’s T test and Welch’s T test are included in Excel and many other financial and statistical programs.
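For readers who want to see the arithmetic, the sketch below computes Welch’s T statistic and the Welch-Satterthwaite degrees of freedom with the Python standard library. It is an illustration, not the AdEvaluator code. The sign convention (reference mean minus test mean, so that a boost in the test period gives a negative T) is an assumption chosen to match the negative value quoted later in this article, and the daily sales numbers are made up.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(reference, test):
    """Return (T statistic, degrees of freedom) for Welch's T test."""
    n_ref, n_test = len(reference), len(test)
    # Squared standard error of each mean, from the sample variances
    se2_ref = variance(reference) / n_ref
    se2_test = variance(test) / n_test
    t = (mean(reference) - mean(test)) / sqrt(se2_ref + se2_test)
    # Welch-Satterthwaite approximation for the degrees of freedom
    dof = (se2_ref + se2_test) ** 2 / (
        se2_ref ** 2 / (n_ref - 1) + se2_test ** 2 / (n_test - 1))
    return t, dof


# Hypothetical daily unit sales, for illustration only
reference_days = [2, 4, 3, 1, 5, 3, 2, 4]
test_days = [4, 5, 3, 6, 4, 2, 5, 3]

t_stat, dof = welch_t(reference_days, test_days)
print("Welch's T statistic:", round(t_stat, 4))
print("degrees of freedom: ", round(dof, 2))
```

Converting T and the degrees of freedom into a p-value needs the Student t distribution, which is not in the standard library. In Python, SciPy’s `scipy.stats.ttest_ind` with `equal_var=False` performs Welch’s test and returns the p-value directly.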

The plot below is a histogram (bar chart) of the number of simulations at each value of Welch’s T statistic. In these simulations, the advertising has no effect on the daily sales (or profits). The assumption that the advertising has no effect is the null hypothesis in the language of classical statistics.

Welch's T Statistics

Welch was able to derive a mathematical formula for the expected distribution (the shape of this histogram) using calculus. The mathematical formula could then be evaluated quickly with pencil and paper or an adding machine, the best available technology of his time (the 1940s).

To derive his formula using calculus, Welch had to assume that the data had a Bell Curve (Normal or Gaussian) distribution. This is at best only approximately true for the sales data above. The distribution of daily sales in the simulated data is actually the Poisson distribution. The Poisson distribution is a better model of sales data and approximates the Bell Curve as the number of sales gets larger. This is why Welch’s T test is often approximately valid for sales data.
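The claim that the Poisson distribution approaches the Bell Curve as sales grow can be checked numerically. The sketch below is illustrative only: it compares the Poisson probabilities with a Bell Curve of the same mean and variance, and reports the largest difference as a fraction of the Poisson peak. The `relative_gap` measure is a simple yardstick chosen for this example, not a standard statistic.

```python
from math import exp, lgamma, log, pi, sqrt


def poisson_pmf(k, rate):
    """Probability of exactly k sales in a day when the average is `rate`."""
    return exp(k * log(rate) - rate - lgamma(k + 1))


def normal_pdf(x, mu, sigma):
    """Bell Curve (Normal or Gaussian) density."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2.0 * pi))


def relative_gap(rate):
    """Largest |Poisson - Bell Curve| difference, relative to the Poisson peak."""
    sigma = sqrt(rate)                   # a Poisson variance equals its mean
    upper = int(rate + 10 * sigma) + 1   # far enough into the upper tail
    pmf = [poisson_pmf(k, rate) for k in range(upper)]
    gaps = [abs(p - normal_pdf(k, rate, sigma)) for k, p in enumerate(pmf)]
    return max(gaps) / max(pmf)


for rate in (3, 30, 300):
    print(f"average {rate:>3} sales per day: relative gap {relative_gap(rate):.3f}")
```

The gap shrinks as the average number of daily sales rises, so the Bell Curve assumption is weakest for exactly the low-volume businesses, such as the three-units-per-day example here.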

Many methods and tests in classical statistics assume a Bell Curve (Normal or Gaussian) distribution and are often approximately correct for real data that is not Bell Curve data. We can compute better, more reliable results with computer simulations using the actual or empirical probability distributions — shown below.

Welch's T Statistic has Bell Curve Shape

More precisely, naming one data set the reference data and the other data set the test data, the T test computes the probability that the test data is due to a chance variation in the process that produced the reference data set. In the advertising example above, the “no advertising” period sales data is the reference data and the “advertising” sales data is the test data. Roughly this probability is the fraction of simulations in the Welch’s T statistic histogram that have a T statistic larger (or smaller for a negative T statistic) than the measured T statistic for the actual data. This probability is known as a p-value, a widely used statistic pioneered by Ronald Fisher.
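The "fraction of simulations" description translates almost directly into code. The sketch below is one way to do it under stated assumptions, not the AdEvaluator implementation: the null hypothesis is simulated by drawing both fake data sets from the observed reference days with replacement, and the daily sales numbers are made up for illustration.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(reference, test):
    """Welch's T statistic (reference mean minus test mean)."""
    standard_error = sqrt(variance(reference) / len(reference)
                          + variance(test) / len(test))
    if standard_error == 0.0:   # both samples constant: no measurable difference
        return 0.0
    return (mean(reference) - mean(test)) / standard_error


def simulated_p_value(reference, test, n_simulations=1000, seed=0):
    """Fraction of no-effect simulations at least as extreme as the real data."""
    rng = random.Random(seed)
    observed = welch_t_statistic(reference, test)
    extreme = 0
    for _ in range(n_simulations):
        # Null hypothesis: both periods come from the reference distribution
        fake_reference = rng.choices(reference, k=len(reference))
        fake_test = rng.choices(reference, k=len(test))
        simulated = welch_t_statistic(fake_reference, fake_test)
        if (observed < 0 and simulated <= observed) or \
           (observed >= 0 and simulated >= observed):
            extreme += 1
    return observed, extreme / n_simulations


# Hypothetical daily unit sales, for illustration only
reference_days = [3, 2, 4, 3, 1, 5, 2, 3, 4, 3, 2, 4, 3, 0, 6, 3, 2, 4, 3, 3]
test_days = [4, 3, 5, 4, 2, 6, 3, 4, 5, 4, 3, 5, 4, 1, 7, 4, 3, 5, 4, 4]

observed_t, p_value = simulated_p_value(reference_days, test_days)
print("observed T statistic:", round(observed_t, 3))
print("simulated p-value:   ", p_value)
```

Because the fake data sets are drawn from the actual reference sales, this estimate does not rely on the Bell Curve assumption behind Welch’s formula.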

Ronald Aylmer Fisher at the start of his career

The p-value has some obvious drawbacks for a business evaluating the effectiveness of advertising. At best it only tells us how unlikely the observed change in sales or profits would be if the advertising had no effect, not how large the boost was nor the risks. Even if on average the advertising boosts sales, what is the risk the advertising will fail or the sales increase will be too small to recover the cost of the advertising?

Fisher worked for Rothamsted Experimental Station in the United Kingdom where he wanted to know whether new breeds of crops, fertilizers, or other new agricultural methods increased yields. His friend and colleague Gosset worked for the Guinness beer company where he was working on improving yields and quality of beer. In both cases, they wanted to know whether a change in the process had a positive effect, not the size of the effect. Without modern computers, using only pencil and paper and adding machines, it was not practical to perform simulations as we easily can today.

Welch’s T statistic has a value of -3.28 for the above sales data. This is in fact lower than nearly all the simulations in the histogram. It is very unlikely the boost in sales is due to chance. The p-value from Welch’s T test for the advertising data above, computed using Welch’s mathematical formula, is only 0.001 (one tenth of one percent). Thus it is very likely the boost in sales is caused by the advertising and not random chance. Note that this does not tell us the size of the boost, whether the advertising is cost effective, or the risk of the investment.

Sales and Profit Projections Using Computer Simulations

We can do much better than Student’s T test and Welch’s T test by using computer simulations based on the empirical probabilities of sales from the reference data — the “no advertising” period sales data. The simulations use random number generators to simulate the random nature of individual sales.

In these simulations, we simulate one year of business operations with advertising many times — one-thousand in the examples shown — using the frequency of sales from the period with advertising. We also simulate one year of business operations without the advertising, using the frequency of sales from the period without advertising in the sales data.
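The "frequency of sales" is simply the fraction of days on which zero, one, two, or more units were sold. A minimal sketch of measuring those fractions and drawing a simulated year from them follows. It is an illustration rather than the AdEvaluator code; the function names and the daily sales numbers are made up.

```python
import random
from collections import Counter


def empirical_frequencies(daily_units):
    """Map each daily unit count to the fraction of days it occurred."""
    counts = Counter(daily_units)
    return {units: counts[units] / len(daily_units) for units in sorted(counts)}


def draw_days(rng, frequencies, n_days):
    """Simulate n_days of unit sales using the measured fractions as weights."""
    outcomes = list(frequencies)
    weights = list(frequencies.values())
    return rng.choices(outcomes, weights=weights, k=n_days)


# Hypothetical daily unit sales, for illustration only
no_ad_days = [0, 1, 0, 2, 1, 0, 1, 0, 0, 1]
ad_days = [1, 1, 0, 2, 2, 1, 0, 3, 1, 1]

no_ad_freq = empirical_frequencies(no_ad_days)
ad_freq = empirical_frequencies(ad_days)
print("without advertising:", no_ad_freq)
print("with advertising:   ", ad_freq)

rng = random.Random(1)  # fixed seed so the run is repeatable
simulated_year = draw_days(rng, ad_freq, 365)
print("simulated units sold in one year with advertising:", sum(simulated_year))
```

Using the measured fractions directly avoids assuming any particular textbook distribution for the sales.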

Frequency of Daily Sales in Both Periods

We compute the annual change in the profit relative to the corresponding period — with or without advertising — in the sales data for each simulated year of business operations.

Annual Profit Projections

The simulations show that we have an average expected increase in profit of $5,977.66 over one year (our annual advertising cost is $6,000.00). They also show that despite this there is a risk of a decrease in profits, in some cases greater than the possible decreases with no advertising.

A business needs to know both the risks — how much money might be lost in a worst case — and the rewards — the average and best possible returns on the advertising investment.

Since sales are a random process like flipping a coin or throwing dice, there is a risk of a decline in profits or actual losses without the advertising. The question is whether the risk with advertising is greater, smaller, or the same. This is known as differential risk.
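The simulation and the risk comparison described above can be sketched in a few lines. This is a simplified illustration, not the AdEvaluator code, and it rests on several assumptions: the full $90 unit price is counted as profit, the baseline is the annualized profit of the "no advertising" period, each simulated day is drawn at random from the observed days, and the daily sales numbers are made up. The results therefore will not match the dollar figures quoted in this article.

```python
import random
from statistics import mean

UNIT_PROFIT = 90.00       # assumption: the full $90 price counts as profit
ANNUAL_AD_COST = 6000.00  # annual advertising cost from the example
DAYS_PER_YEAR = 365


def simulate_profit_changes(rng, observed_days, baseline_profit, ad_cost,
                            n_years=1000):
    """Simulate n_years of operations; return each year's change in profit."""
    changes = []
    for _ in range(n_years):
        units = sum(rng.choices(observed_days, k=DAYS_PER_YEAR))
        changes.append(units * UNIT_PROFIT - ad_cost - baseline_profit)
    return changes


def summarize(changes):
    """Average reward plus two simple risk measures."""
    ordered = sorted(changes)
    return {
        "average": mean(changes),
        "worst_5_percent": ordered[len(ordered) // 20],
        "fraction_with_decline": sum(c < 0 for c in changes) / len(changes),
    }


# Hypothetical daily unit sales, for illustration only
no_ad_days = [3, 2, 4, 3, 1, 5, 2, 3, 4, 3, 2, 4, 3, 0, 6, 3, 2, 4, 3, 3]
ad_days = [3, 3, 4, 4, 1, 5, 3, 3, 5, 3, 3, 4, 4, 1, 6, 3, 3, 4, 4, 4]

baseline = mean(no_ad_days) * DAYS_PER_YEAR * UNIT_PROFIT
rng = random.Random(42)  # fixed seed so the run is repeatable
no_ad_summary = summarize(simulate_profit_changes(rng, no_ad_days, baseline, 0.0))
ad_summary = summarize(
    simulate_profit_changes(rng, ad_days, baseline, ANNUAL_AD_COST))

print("without advertising:", no_ad_summary)
print("with advertising:   ", ad_summary)
```

Comparing the `worst_5_percent` and `fraction_with_decline` entries of the two summaries is one simple way to express the differential risk in dollar terms.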

The Problem with p-values

This is a concrete example of the problem with p-values for evaluating the effectiveness of advertising. In this case, the advertising increases the average daily sales from 100 units per day to 101 units per day. Each unit costs one dollar (a candy bar for example).

P-VALUE SHOWS BOOST IN SALES

The p-value from Welch’s T test is 0.007 (seven tenths of one percent). The advertising is almost certainly effective but the boost in sales is much less than the cost of the advertising:

Profit Projections

The average expected decline in profits over the simulations is $5,128.84.

The p-value is not a good estimate of the potential risks and rewards of investing in advertising. Sales and profit projections from computer simulations based on a mathematical model derived from the reference sales data are a better (not perfect) estimate of the risks and rewards.

Multidimensional Sales Data

The above examples are simple cases where the only change is the addition of the advertising. There are no price changes, other advertising or marketing expenses, or other changes in business or economic conditions. There are no seasonal effects in the sales.

Student’s T test, Welch’s T test, and many other statistical tests are designed and valid only for simple controlled cases such as this where there is only one change between the reference and test data. These tests were well suited to data collected at the Rothamsted Experimental Station, Guinness breweries, and similar operations.

Modern businesses purchasing advertising from Facebook, other social media services, and modern media providers (e.g. the New York Times) face more complex conditions with many possible input variables (unit price, weather, unemployment rate, multiple advertising services, etc.) changing frequently or continuously.

For these, financial analysts need to extract predictive multidimensional mathematical models from the data and then perform similar simulations to evaluate the effect of advertising on sales and profits.

Example Software

The AdEvaluator™ software used in these examples is free open source software (FOSS) developed using the Anaconda Python 3 distribution. It is available under the GNU General Public License Version 3.

AdEvaluator can be downloaded here.

Disclaimer

AdEvaluator™ is designed for cases with a single product or service with a constant unit price during both periods. AdEvaluator™ needs a reference period without the new advertising and a test period with the new advertising campaign. The new advertising campaign should be the only significant change between the two periods. AdEvaluator™ also assumes that the probability of the daily sales is independent and identically distributed during each period. This is not true in all cases. Exercise your professional business judgement whether the results of the simulations are applicable to your business.

This program comes with ABSOLUTELY NO WARRANTY; for details use -license option at the command line or select Help | License… in the graphical user interface (GUI). This is free software, and you are welcome to redistribute it under certain conditions.

We are developing a professional version of AdEvaluator™ for multidimensional cases. This version uses our Math Recognition™ technology to automatically identify good multidimensional mathematical models.

The Math Recognition™ technology is applicable to many types of data, not just sales and advertising data. It can for example be applied to complex biological systems such as the blood coagulation system which causes heart attacks and strokes when it fails. According to the US Centers for Disease Control (CDC), about 633,000 people died from heart attacks and 140,000 from strokes in 2016.

Conclusion

It is often difficult to evaluate whether advertising is boosting sales and profits, despite the ready availability of sales and profit data for most businesses. This is caused by the unpredictable nature of individual sales and frequently by the complex multidimensional business environment where price changes, economic downturns and upturns, the weather, and other factors combine with the advertising to produce a confusing picture.

In simple cases with a single change, the addition of the new advertising, Student’s T test, Welch’s T test and other methods from classical statistics can help evaluate the effect of the advertising on sales and profits. These statistical tests can detect an effect but provide no clear estimate of the magnitude of the effect on sales and profits and the financial risks and rewards.
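
As a minimal sketch of the classical approach, the following computes Welch’s t statistic and the Welch-Satterthwaite degrees of freedom for a reference period and a test period using only the Python standard library. The daily sales figures are made up for illustration and the function name welch_t is hypothetical; it is not part of AdEvaluator™.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t test."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # sample variances (n - 1 in the denominator)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    se2_a, se2_b = var_a / n_a, var_b / n_b  # squared standard errors
    t = (mean_b - mean_a) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, dof

# hypothetical daily unit sales (illustrative numbers only)
reference_sales = [20, 22, 19, 24, 21, 18, 23, 20, 22, 21]
test_sales = [24, 27, 22, 26, 25, 23, 28, 24, 26, 25]

t_stat, dof = welch_t(reference_sales, test_sales)
print("Welch t = %.2f with about %.1f degrees of freedom" % (t_stat, dof))
```

A large absolute value of t suggests the difference in mean daily sales is unlikely to be chance, but the statistic by itself says nothing about how much money the advertising makes or loses.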

Sales and profit projections based on computer simulations using the empirical probability of sales from the actual sales data can provide quantitative estimates of the effect on sales and profits, including estimates of the financial risks (chance of losing money) and the financial rewards (typical and best case profits).
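
A simulation of this kind can be sketched as follows, assuming a simple resampling (bootstrap) scheme in which each simulated day draws its sales at random from the observed daily sales of the corresponding period. The sales data, unit profit, advertising cost, and the function name simulate_profit_change are all hypothetical; this is an illustration of the idea, not the AdEvaluator™ source code.

```python
import random
import statistics

def simulate_profit_change(reference_sales, test_sales, unit_profit,
                           ad_cost_per_day, n_days=30, n_trials=2000, seed=0):
    """Resample daily sales from each period's empirical distribution and
    return the simulated change in profit over n_days for each trial."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    changes = []
    for _ in range(n_trials):
        # draw daily unit sales with replacement from the observed data
        ref_units = sum(rng.choices(reference_sales, k=n_days))
        test_units = sum(rng.choices(test_sales, k=n_days))
        extra_profit = (test_units - ref_units) * unit_profit
        changes.append(extra_profit - ad_cost_per_day * n_days)
    return changes

# hypothetical inputs (illustrative numbers only)
reference_sales = [20, 22, 19, 24, 21, 18, 23, 20, 22, 21]
test_sales = [24, 27, 22, 26, 25, 23, 28, 24, 26, 25]
changes = simulate_profit_change(reference_sales, test_sales,
                                 unit_profit=5.0, ad_cost_per_day=15.0)

changes.sort()
chance_of_loss = sum(c < 0 for c in changes) / len(changes)
typical = statistics.median(changes)
best_case = changes[int(0.95 * len(changes))]  # 95th percentile
print("chance of losing money: %.1f %%" % (100 * chance_of_loss))
print("typical change in profit: %.2f" % typical)
print("best case (95th percentile): %.2f" % best_case)
```

Like the free version of AdEvaluator™, this sketch assumes daily sales are independent and identically distributed within each period, which is not true in all cases.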

(C) 2018 by John F. McGowan, Ph.D.

About Me

John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).

Another Skeptical Look at STEM Shortage Numbers

College STEM Degrees (NSF Science and Engineering Indicators 2018)

It is common to encounter claims of a “desperate” or “severe” shortage of STEM (Science, Technology, Engineering, and Mathematics) workers, either current or projected, usually from employers of STEM workers. These claims are perennial and date back at least to the 1940’s after World War II despite the huge number of STEM workers employed in wartime STEM projects (the Manhattan Project that developed the atomic bomb, military radar, code breaking machines and computers, the B-29 and other high tech bombers, the development of penicillin, K-rations, etc.). This article takes a look at the STEM degree numbers in the National Science Foundation’s Science and Engineering Indicators 2018 report.

College STEM Degrees (NSF Science and Engineering Indicators 2018)

I looked at the total Science and Engineering bachelor’s degrees granted each year, which includes degrees in Social Science, Psychology, and Biological and agricultural sciences as well as hard core Engineering, Computer Science, Mathematics, and Physical Sciences. I also looked specifically at the totals for “hard” STEM degrees (Engineering, Computer Science, Mathematics, and Physical Sciences). I also included the total number of K-12 students who pass (score 3, 4, or 5 out of 5) the Advanced Placement (AP) Calculus Exam (either the AB exam or the more advanced BC exam) each year.

I fitted an exponential growth model to each data series. The exponential growth model fits well to the total STEM degrees and AP passing data. The exponential growth model roughly agrees with the hard STEM degree data, but there is a clear difference, reflected in the coefficient of determination (R-SQUARED) of 0.76 meaning the model explains about 76 percent of the variation in the data.
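
To make the fitting step concrete, here is a minimal sketch that fits an exponential growth model to the hard STEM degree numbers for 2001 to 2015 (copied from the table in the appendix) and computes the annual growth rate and R-SQUARED. It is an assumption-laden simplification: it uses an ordinary least squares fit to the logarithm of the data, whereas the analysis program in the appendix uses a nonlinear fit with scipy, so the numbers it prints will differ somewhat from the analysis results reported here. The helper name fit_exponential is hypothetical.

```python
import math

# hard STEM bachelor's degrees in thousands, 2001 to 2015, copied from the
# "Total Hard STEM" column of the table in the appendix
years = list(range(2001, 2016))
hard_stem = [132.36, 140.55, 152.64, 156.51, 154.52, 151.92, 147.50, 146.64,
             147.79, 154.54, 164.21, 177.32, 188.54, 201.26, 213.00]

def fit_exponential(x, y):
    """Least squares fit of ln(y) = ln(a) + b*x; returns (a, b)."""
    log_y = [math.log(v) for v in y]
    n = len(x)
    mean_x = sum(x) / n
    mean_ly = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ly) for xi, li in zip(x, log_y))
    b = sxy / sxx
    a = math.exp(mean_ly - b * mean_x)
    return a, b

START_YEAR = 1970  # same time origin as the appendix program
t = [year - START_YEAR for year in years]
a, b = fit_exponential(t, hard_stem)
predicted = [a * math.exp(b * ti) for ti in t]

# coefficient of determination computed on the original (not log) scale
mean_y = sum(hard_stem) / len(hard_stem)
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(hard_stem, predicted))
ss_tot = sum((yi - mean_y) ** 2 for yi in hard_stem)
r_squared = 1.0 - ss_res / ss_tot

# convert the continuous rate b into a percent change per year
growth_rate_pct = (math.exp(b) - 1.0) * 100.0

print("annual growth rate: %.2f %%" % growth_rate_pct)
print("R-squared: %.2f" % r_squared)
```

The residuals from such a fit are what make the dot com bump visible: the data sit above the fitted curve in the early 00’s and below it a few years later.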

One can easily see that the number of hard STEM degrees significantly exceeds the trend line in the early 00’s (2000 to about 2004) and drops well below from 2004 to 2008, rebounding in 2008. This probably reflects the surge in CS degrees specifically due to the Internet/dot com bubble (1995-2001).

There appears to be a lag of about four years between the actual dot com crash usually dated to a stock market drop in March of 2000 and the drop in production of STEM bachelor’s degrees in about 2004.

Analysis results:

TOTAL Scientists and Engineers 2016: 6,900,000

ALL STEM Bachelor's Degrees
ESTIMATED TOTAL IN 2016 SINCE 1970: 15,970,052
TOTAL FROM 2001 to 2015 (Science and Engineering Indicators 2018)  7,724,850
ESTIMATED FUTURE STUDENTS (2016 to 2026): 8,758,536
ANNUAL GROWTH RATE:  3.45 %  US POPULATION GROWTH RATE (2016): 0.7 %

HARD STEM DEGREES ONLY (Engineering, Physical Sciences, Math, CS)
ESTIMATED TOTAL IN 2016 SINCE 1970: 5,309,239
TOTAL FROM 2001 to 2015 (Science and Engineering Indicators 2018)  2,429,300
ESTIMATED FUTURE STUDENTS (2016 to 2026): 2,565,802
ANNUAL GROWTH RATE:  2.88 %  US POPULATION GROWTH RATE (2016): 0.7 %

STUDENTS PASSING AP CALCULUS EXAM
ESTIMATED TOTAL IN 2016 SINCE 1970: 5,045,848
TOTAL FROM 2002 to 2016  (College Board)  3,038,279
ESTIMATED FUTURE STUDENTS (2016 to 2026): 4,199,602
ANNUAL GROWTH RATE:  5.53 %  US POPULATION GROWTH RATE (2016): 0.7 %
estimate_college_stem.py ALL DONE

The table below gives the raw numbers from Figure 02-10 in the NSF Science and Engineering Indicators 2018 report with a column for total STEM degrees and a column for total STEM degrees in hard science and technology subjects (Engineering, Computer Science, Mathematics, and Physical Sciences) added for clarity:

STEM Degrees Table fig02-10 Revised

In the raw numbers, we see steady growth in social science and psychology STEM degrees from 2000 to 2015 with no obvious sign of the Internet/dot com bubble. There is a slight drop in Biological and agricultural sciences degrees in the early 00’s. Somewhat larger drops can be seen in Engineering and Physical Sciences degrees in the early 00’s as well as a concomitant sharp rise in Computer Science (CS) degrees. This probably reflects strong STEM students shifting into CS degrees.

The number of K-12 students taking and passing the AP Calculus Exam (either the AB or more advanced BC exam) grows continuously and rapidly during the entire period from 1997 to 2016, growing at over five percent per year, far above the United States population growth rate of 0.7 percent per year.
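
One way to see how large the gap between these two growth rates is: convert each rate into a doubling time. The short sketch below uses the 5.53 percent and 0.7 percent figures from the analysis results; the function name doubling_time_years is hypothetical, and the calculation assumes the growth rates stay constant.

```python
import math

def doubling_time_years(annual_growth_pct):
    """Years for a quantity to double at a constant annual growth rate."""
    return math.log(2.0) / math.log(1.0 + annual_growth_pct / 100.0)

ap_growth_pct = 5.53          # students passing AP Calculus (analysis results)
population_growth_pct = 0.7   # US population growth rate (2016)

ap_doubling = doubling_time_years(ap_growth_pct)
population_doubling = doubling_time_years(population_growth_pct)
print("AP Calculus passers double in about %.0f years" % ap_doubling)
print("US population doubles in about %.0f years" % population_doubling)
```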

The number of college students earning hard STEM degrees appears to be slightly smaller than the four year lagged number of K-12 students passing the AP exam, suggesting some attrition of strong STEM students at the college level. We might expect the number of hard STEM bachelor’s degrees granted each year to be the same or very close to the number of AP Exam passing students four years earlier.

A model using only the hard STEM bachelor’s degree students gives a total number of STEM college students produced since 1970 of five million, pretty close to the number of K-12 students estimated from the AP Calculus exam data. This is somewhat less than the 6.9 million total employed STEM workers estimated by the United States Bureau of Labor Statistics.

Including all STEM degrees gives a huge surplus of STEM students/workers, most not employed in a STEM field as reported by the US Census and numerous media reports.

The hard STEM degree model predicts about 2.5 million new STEM workers graduating between 2016 and 2026. This is slightly more than the number of STEM job openings seemingly predicted by the Bureau of Labor Statistics (about 800,000 new STEM jobs and about 1.5 million retirements and deaths of current aging STEM workers giving a total of about 2.3 million “new” jobs). The AP student model predicts about 4 million new STEM workers, far exceeding the BLS predictions and most other STEM employment predictions.

The data and models do not include the effects of immigration and guest worker programs such as the controversial H-1B visa, L-1 visa, OPT visa, and O (“Genius”) visa. Immigrants and guest workers play an outsized role in the STEM labor force and specifically in the computer science/software labor force (estimated at 3-4 million workers, over half of the STEM labor force).

Difficulty of Evaluating “Soft” STEM Degrees

Social science, psychology, biological and agricultural sciences STEM degrees vary widely in rigor and technical requirements. The pioneering statistician Ronald Fisher developed many of his famous methods as an agricultural researcher at the Rothamsted agricultural research institute. The leading data analysis tool SAS from the SAS Institute was originally developed by agricultural researchers at North Carolina State University. IBM’s SPSS (Statistical Package for the Social Sciences) data analysis tool, number three in the market, was developed for social sciences. Many “hard” sciences such as experimental particle physics use methods developed by Fisher and other agricultural and social scientists. Nonetheless, many “soft” science STEM degrees do not involve the same level of quantitative, logical, and programming skills typical of “hard” STEM fields.

In general, STEM degrees at the college level are not highly standardized. There is no national or international standard test or tests comparable to the AP Calculus exams at the K-12 level to get a good national estimate of the number of qualified students.

The numbers suggest but do not prove that most K-12 students who take and pass AP Calculus continue on to hard STEM degrees or some type of rigorous biology or agricultural sciences degree — hence the slight drop in biology and agricultural science degrees during the dot com bubble period with students shifting to CS degrees.

Conclusion

Both the college “hard” STEM degree data and the K-12 AP Calculus exam data strongly suggest that the United States can and will produce more qualified STEM students than job openings predicted for the 2016 to 2026 period. The surplus is modest according to the college data, much larger according to the AP exam data, and huge if all STEM degrees including psychology and social science are considered. The data and models do not include the substantial number of immigrants and guest workers in STEM jobs in the United States.

NOTE: The raw data in text CSV (comma separated values) format and the Python analysis program are included in the appendix below.

(C) 2018 by John F. McGowan, Ph.D.


Appendix: Source Code and Raw Data

AP Calculus Totals.csv

Year,Total
2016.0,284750.0
2015.0,268316.0
2014.0,264023.0
2013.0,251354.0
2012.0,237184.0
2011.0,211890.0
2010.0,202336.0
2009.0,195667.0
2008.0,191664.0
2007.0,176072.0
2006.0,172396.0
2005.0,151935.0
2004.0,143779.0
2003.0,146996.0
2002.0,139917.0

STEM Degrees with Totals.csv

Year,Social sciences,Biological and agricultural sciences,Psychology,Engineering,Computer sciences,Physical sciences,Mathematics and statistics,Total STEM,Total Hard STEM
2000,113.50,83.13,74.66,59.49,37.52,18.60,11.71,398.61,127.32
2001,114.47,79.48,74.12,59.21,43.60,18.11,11.44,400.43,132.36
2002,119.11,79.03,77.30,60.61,49.71,17.98,12.25,415.99,140.55
2003,129.74,81.22,79.16,63.79,57.93,18.06,12.86,442.76,152.64
2004,137.74,81.81,82.61,64.68,59.97,18.12,13.74,458.67,156.51
2005,144.57,85.09,86.03,66.15,54.59,18.96,14.82,470.21,154.52
2006,148.11,90.28,88.55,68.23,48.00,20.38,15.31,478.86,151.92
2007,150.73,97.04,90.50,68.27,42.60,21.08,15.55,485.77,147.50
2008,155.67,100.87,92.99,69.91,38.92,21.97,15.84,496.17,146.64
2009,158.18,104.73,94.74,70.60,38.50,22.48,16.21,505.44,147.79
2010,163.07,110.02,97.75,74.40,40.11,23.20,16.83,525.38,154.54
2011,172.18,116.41,101.57,78.10,43.59,24.50,18.02,554.37,164.21
2012,177.33,124.96,109.72,83.26,47.96,26.29,19.81,589.33,177.32
2013,179.26,132.31,115.37,87.81,51.59,27.57,21.57,615.48,188.54
2014,177.94,138.32,118.40,93.95,56.13,28.95,22.23,635.92,201.26
2015,173.72,144.58,118.77,99.91,60.31,29.64,23.14,650.07,213.00

estimate_college_stem.py

#
#  Estimate the total production of STEM students at the
#  College level from BS degrees granted (United States)
#
#  (C) 2018 by John F. McGowan, Ph.D. (ceo@mathematical-software.com)
#

# Python standard libraries
import os
import sys
import time

# Numerical/Scientific Python libraries
import numpy as np
import scipy.optimize as opt  # curve_fit()
import pandas as pd  # reading text CSV files etc.

# Graphics
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
from mpl_toolkits.mplot3d import Axes3D

# customize fonts
SMALL_SIZE = 8
MEDIUM_SIZE = 10
LARGE_SIZE = 12
XL_SIZE = 14
XXL_SIZE = 16

plt.rc('font', size=XL_SIZE)          # controls default text sizes
plt.rc('axes', titlesize=XL_SIZE)     # fontsize of the axes title
plt.rc('axes', labelsize=XL_SIZE)     # fontsize of the x and y labels
plt.rc('xtick', labelsize=XL_SIZE)    # fontsize of the tick labels
plt.rc('ytick', labelsize=XL_SIZE)    # fontsize of the tick labels
plt.rc('legend', fontsize=XL_SIZE)    # legend fontsize
plt.rc('figure', titlesize=XL_SIZE)   # fontsize of the figure title

# STEM Bachelors Degrees earned by year (about 2000 to 2015)
#
# data from National Science Foundation (NSF)/ National Science Board
# Science and Engineering Indicators 2018 Report
# https://www.nsf.gov/statistics/2018/nsb20181/
# Figure 02-10
#
input_file = "STEM Degrees with Totals.csv"

if len(sys.argv) > 1:
    index = 1
    while index < len(sys.argv):
        if sys.argv[index] in ["-i", "-input"]:
            input_file = sys.argv[index+1]
            index += 1
        elif sys.argv[index] in ["-h", "--help", "-help", "-?"]:
            print("Usage:", sys.argv[0], " -i input_file='STEM Degrees with Totals.csv'")
            sys.exit(0)
        index +=1

print(__file__, "started", time.ctime())  # time stamp
print("Processing data from: ", input_file)

# read text CSV file (exported from spreadsheet)
df = pd.read_csv(input_file)

# drop rows with NaNs (missing values) in Pandas
# dropna() returns a new DataFrame, so the result must be assigned
df = df.dropna()

# get number of students who pass AP Calculus Exam (AB or BC)
# each year
df_ap_pass = pd.read_csv("AP Calculus Totals.csv")
ap_year = df_ap_pass.values[:,0]
ap_total = df_ap_pass.values[:,1] 

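# numerical data
# NOTE: the [1:] slices skip the first data row (2000), so the fits and
# the reported totals start with 2001
hard_stem_str = df.values[1:,-1] # engineering, physical sciences, math/stat, CS
all_stem_str = df.values[1:,-2]  # includes social science, psychology, agriculture etc.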

hard_stem = np.zeros(hard_stem_str.shape)
all_stem = np.zeros(all_stem_str.shape)

# NOTE: the np.float alias was removed in NumPy 1.24;
# use the builtin float and np.floating instead
for index, val in enumerate(hard_stem_str.ravel()):
    if isinstance(val, str):
        hard_stem[index] = float(val.replace(',',''))
    elif isinstance(val, (float, np.floating)):
        hard_stem[index] = val
    else:
        raise TypeError("unsupported type " + str(type(val)))

for index, val in enumerate(all_stem_str.ravel()):
    if isinstance(val, str):
        all_stem[index] = float(val.replace(',', ''))
    elif isinstance(val, (float, np.floating)):
        all_stem[index] = val
    else:
        raise TypeError("unsupported type " + str(type(val)))

DEGREES_PER_UNIT = 1000
# units are thousands of degrees granted
all_stem = DEGREES_PER_UNIT*all_stem
hard_stem = DEGREES_PER_UNIT*hard_stem

years_str = df.values[1:,0]
years = np.zeros(years_str.shape)
for index, val in enumerate(years_str.ravel()):
    years[index] = float(val)

# almost everyone in the labor force graduated since 1970
# someone 18 years old in 1970 is 66 today (2018)
START_YEAR = 1970

def my_exp(x, *p):
    """
    exponential model for curve_fit(...)
    """
    return p[0]*np.exp(p[1]*(x - START_YEAR))

# starting guess for model parameters
p_start = [ 50000.0, 0.01 ]

# fit all STEM degree data
popt, pcov = opt.curve_fit(my_exp, years, all_stem, p_start)

# fit hard STEM degree data
popt_hard_stem, pcov_hard_stem = opt.curve_fit(my_exp, \
                                               years, \
                                               hard_stem, \
                                               p_start)
# fit AP Students data
popt_ap, pcov_ap = opt.curve_fit(my_exp, \
                                 ap_year, \
                                 ap_total, \
                                 p_start)

print(popt)  # sanity check

STOP_YEAR = 2016
NYEARS = (STOP_YEAR - START_YEAR + 1)

years_fit = np.linspace(START_YEAR, STOP_YEAR, NYEARS)
n_fit = my_exp(years_fit, *popt)

n_pred = my_exp(years, *popt)

r2 = 1.0 - (n_pred - all_stem).var()/all_stem.var()
r2_str = "%4.3f" % r2

n_fit_hard = my_exp(years_fit, *popt_hard_stem)
n_pred_hard = my_exp(years, *popt_hard_stem)

r2_hard = 1.0 - (n_pred_hard - hard_stem).var()/hard_stem.var()
r2_hard_str = "%4.3f" % r2_hard

n_fit_ap = my_exp(years_fit, *popt_ap)
n_pred_ap = my_exp(ap_year, *popt_ap)

r2_ap = 1.0 - (n_pred_ap - ap_total).var()/ap_total.var()
r2_ap_str = "%4.3f" % r2_ap


cum_all_stem = n_fit.sum()
cum_hard_stem = n_fit_hard.sum()
cum_ap_stem = n_fit_ap.sum()

# to match BLS projections
future_years = np.linspace(2016, 2026, 11)

assert future_years.size == 11  # sanity check

future_students = my_exp(future_years, *popt)
future_students_hard = my_exp(future_years, *popt_hard_stem)
future_students_ap = my_exp(future_years, *popt_ap)

# https://fas.org/sgp/crs/misc/R43061.pdf
#
# The U.S. Science and Engineering Workforce: Recent, Current,
# and Projected Employment, Wages, and Unemployment
#
# by John F. Sargent Jr.
# Specialist in Science and Technology Policy
# November 2, 2017
#
# Congressional Research Service 7-5700 www.crs.gov R43061
#
# "In 2016, there were 6.9 million scientists and engineers (as
# defined in this report) employed in the United States, accounting
# for 4.9 % of total U.S. employment."
#

# BLS astonishing/bizarre projections for 2016-2026

# "The Bureau of Labor Statistics (BLS) projects that the number of S&E
# jobs will grow by 853,600 between 2016 and 2026 , a growth rate
# (1.1 % CAGR) that is somewhat faster than that of the overall
# workforce ( 0.7 %). In addition, BLS projects that 5.179 million
# scientists and engineers will be needed due to labor force exits and
# occupational transfers (referred to collectively as occupational
# separations ). BLS projects the total number of openings in S&E due to growth ,
# labor force exits, and occupational transfers between 2016 and 2026 to be
# 6.033 million, including 3.477 million in the computer occupations and
# 1.265 million in the engineering occupations."

# NOTE: This appears to project 5.179/6.9 or 75 percent!!!! of current STEM
# labor force LEAVE THE STEM PROFESSIONS by 2026!!!!

# "{:,}".format(value) to specify the comma separated thousands format
#
print("TOTAL Scientists and Engineers 2016:", "{:,.0f}".format(6.9e6))
# ALL STEM
print("\nALL STEM Bachelor's Degrees")
print("ESTIMATED TOTAL IN 2016 SINCE ", START_YEAR, ": ", \
      "{:,.0f}".format(cum_all_stem), sep='')
# don't use comma grouping for years
print("TOTAL FROM", "{:.0f}".format(years_str[0]), \
      "to 2015 (Science and Engineering Indicators 2018) ", \
      "{:,.0f}".format(all_stem.sum()))
print("ESTIMATED FUTURE STUDENTS (2016 to 2026):", \
      "{:,.0f}".format(future_students.sum()))
# annual growth rate of all STEM bachelor's degrees
growth_rate_pct = (np.exp(popt[1]) - 1.0)*100

print("ANNUAL GROWTH RATE: ", "{:,.2f}".format(growth_rate_pct), \
      "%  US POPULATION GROWTH RATE (2016): 0.7 %")

# HARD STEM

print("\nHARD STEM DEGREES ONLY (Engineering, Physical Sciences, Math, CS)")
print("ESTIMATED TOTAL IN 2016 SINCE ", START_YEAR, ": ", \
      "{:,.0f}".format(cum_hard_stem), sep='')
# don't use comma grouping for years
print("TOTAL FROM", "{:.0f}".format(years_str[0]), \
      "to 2015 (Science and Engineering Indicators 2018) ", \
      "{:,.0f}".format(hard_stem.sum()))
print("ESTIMATED FUTURE STUDENTS (2016 to 2026):", \
      "{:,.0f}".format(future_students_hard.sum()))
# annual growth rate of hard STEM bachelor's degrees
growth_rate_pct_hard = (np.exp(popt_hard_stem[1]) - 1.0)*100

print("ANNUAL GROWTH RATE: ", "{:,.2f}".format(growth_rate_pct_hard), \
      "%  US POPULATION GROWTH RATE (2016): 0.7 %")


# AP STEM -- Students passing AP Calculus Exam Each Year 

print("\nSTUDENTS PASSING AP CALCULUS EXAM")
print("ESTIMATED TOTAL IN 2016 SINCE ", START_YEAR, ": ", \
      "{:,.0f}".format(cum_ap_stem), sep='')
# don't use comma grouping for years
print("TOTAL FROM", "{:.0f}".format(ap_year[-1]), \
      "to", "{:.0f}".format(ap_year[0])," (College Board) ", \
      "{:,.0f}".format(ap_total.sum()))
print("ESTIMATED FUTURE STUDENTS (2016 to 2026):", \
      "{:,.0f}".format(future_students_ap.sum()))
# annual growth rate of students taking AP Calculus
growth_rate_pct_ap = (np.exp(popt_ap[1]) - 1.0)*100

print("ANNUAL GROWTH RATE: ", "{:,.2f}".format(growth_rate_pct_ap), \
      "%  US POPULATION GROWTH RATE (2016): 0.7 %")


# US Census reports 0.7 percent annual growth of US population in 2016
# SOURCE: https://www.census.gov/newsroom/press-releases/2016/cb16-214.html
#

f1 = plt.figure(figsize=(12,9))
ax = plt.gca()
# add commas to tick values (e.g. 1,000 instead of 1000)
ax.get_yaxis().set_major_formatter(
    ticker.FuncFormatter(lambda x, p: format(int(x), ',')))

DOT_COM_CRASH = 2000.25  # usually dated March 10, 2000
OCT_2008_CRASH = 2008.75 # usually dated October 11, 2008
DELTA_LABEL_YEARS = 0.5

plt.plot(years_fit, n_fit, 'g', linewidth=3, label='ALL STEM FIT')
plt.plot(years, all_stem, 'bs', markersize=10, label='ALL STEM DATA')
plt.plot(years_fit, n_fit_hard, 'r', linewidth=3, label='HARD STEM FIT')
plt.plot(years, hard_stem, 'ms', markersize=10, label='HARD STEM DATA')
plt.plot(years_fit, n_fit_ap, 'k', linewidth=3, label='AP STEM FIT')
plt.plot(ap_year, ap_total, 'cd', markersize=10, label='AP STEM DATA')
[ylow, yhigh] = plt.ylim()
dy = yhigh - ylow
# add marker lines for crashes
plt.plot((DOT_COM_CRASH, DOT_COM_CRASH), (ylow+0.1*dy, yhigh), 'b-')
plt.text(DOT_COM_CRASH + DELTA_LABEL_YEARS, 0.9*yhigh, '<-- DOT COM CRASH')
# plt.arrow(...) add arrow (arrow does not render correctly)

plt.plot((OCT_2008_CRASH, OCT_2008_CRASH), (ylow+0.1*dy, 0.8*yhigh), 'b-')
plt.text(OCT_2008_CRASH+DELTA_LABEL_YEARS, 0.5*yhigh, '<-- 2008 CRASH')
plt.legend()
plt.title('STUDENTS STEM BACHELORS DEGREES (ALL R**2=' \
          + r2_str + ',  HARD R**2=' + r2_hard_str + \
          ', AP R**2=' + r2_ap_str + ')')
plt.xlabel('YEAR')
plt.ylabel('TOTAL STEM BS DEGREES')
# appear to need to do this after the plots
# to get valid ranges
[xlow, xhigh] = plt.xlim()
[ylow, yhigh] = plt.ylim()
dx = xhigh - xlow
dy = yhigh - ylow
# put input data file name in lower right corner
plt.text(xlow + 0.65*dx, \
         ylow + 0.05*dy, \
         input_file, \
         bbox=dict(facecolor='red', alpha=0.2))

plt.show()

f1.savefig('College_STEM_Degrees.jpg')

print(__file__, "ALL DONE")

The Mathematics Recognition Problem

 

A brief introduction to the math recognition problem and automatic math recognition using modern artificial intelligence and pattern recognition methods. Includes a call for data.  About 14 minutes.

(C) 2018 by John F. McGowan, Ph.D.


The Perils of Particle Physics

Lost in Math

Sabine Hossenfelder’s Lost in Math: How Beauty Leads Physics Astray (Basic Books, June 2018) is a critical account of the disappointing progress in fundamental physics, primarily particle physics and cosmology, since the formulation of the “standard model” in the 1970’s.  It focuses on the failure to find new physics at CERN’s $13.25 billion Large Hadron Collider (LHC) and many questionable predictions that super-symmetric particles, hidden dimensions, or other exotica beloved of theoretical particle physicists would be found at LHC when it finally turned on.  In many ways, this lack of progress in fundamental physics parallels and perhaps underlies the poor progress in power and propulsion technologies since the 1970s.

Lost in Math joins a small but growing collection of popular and semi-popular books and personal accounts critical of particle physics including David Lindley’s 1994 The End of Physics: The Myth of a Unified Theory, Lee Smolin’s The Trouble with Physics: The Rise of String Theory, the Fall of Science and What Comes Next, and Peter Woit’s Not Even Wrong: The Failure of String Theory and the Search for Unity in Physical Law.  It shares many points in common with these earlier books. Indeed, Peter Woit is quoted on the back cover and Lee Smolin is listed in the acknowledgements as a volunteer who read drafts of the manuscript.  Anyone considering prolonged involvement, e.g. graduate school, or a career in particle physics should read Lost in Math as well as these earlier books.

The main premise of Lost in Math is that theoretical particle physicists like the author have been led astray by an unscientific obsession with mathematical “beauty” in selecting and also refusing to abandon theories, notably super-symmetry (usually abbreviated as SUSY in popular physics writing), despite an embarrassing lack of evidence.  The author groups together several different issues under the rubric of “beauty” including the use of the terms beauty and elegance by theoretical physicists, at least two kinds of “naturalness,” the “fine tuning” of the constants in a theory to make it consistent with life, the desire for simplicity, dissatisfaction with the complexity of the standard model (twenty-five “fundamental” particles and a complex Lagrangian that fills two pages of fine print in a physics textbook), doubts about renormalization (an ad hoc procedure for removing otherwise troubling infinities) in Quantum Field Theory (QFT), and questions about “measurement” in quantum mechanics.  Although I agree with many points in the book, I feel the blanket attack on “beauty” is too broad, conflating several different issues, and misses the mark.

In Defense of “Beauty”

As the saying goes, beauty is in the eye of the beholder.  The case for simplicity or more accurately falsifiability in mathematical models is on a sounder, more objective basis than beauty however.  In many cases a complex model with many terms and adjustable parameters can fit many different data sets.  Some models are highly plastic.  They can fit almost any data set not unlike the way saran wrap can fit almost any surface.  These models are wholly unfalsifiable.

A mathematical model which can match any data set cannot be disproven.  It is not falsifiable.  A theory that predicts everything, predicts nothing.

Some models are somewhat plastic, able to fit many but not all data sets, not unlike a rubber sheet.  They are hard to falsify — somewhat unfalsifiable.  Some models are quite rigid, like a solid piece of stone fitting into another surface.  These models are fully falsifiable.

A simple well known example of this problem is a polynomial with many terms.  A polynomial with enough terms can match any data set.  In general, the fitted model will fail to extrapolate, that is, to predict data points outside the domain of the data set used in the model fitting (the training set in the terminology of neural networks, for example).  The fitted polynomial model will frequently interpolate correctly, that is, predict data points within the domain of the data set used in the model fitting (points near and in between the training set data points).  Thus, we can say that a polynomial model with enough terms is not falsifiable in the sense of the philosopher of science Karl Popper because it can fit many data sets, not just the data set we actually have (real data).
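
The polynomial example can be illustrated with a short sketch.  It builds the Lagrange interpolating polynomial, which has as many terms as there are data points and passes exactly through any data set with distinct x values, and then compares its error inside and outside the domain of the data.  The choice of a sine wave sampled at eight points as the “training set” and the function name lagrange_poly are assumptions made only for illustration.

```python
import math

def lagrange_poly(xs, ys):
    """Return the polynomial of degree len(xs)-1 passing through all (x, y)."""
    def poly(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return poly

# "training" data: 8 samples of sin(x) at x = 0, 1, ..., 7 (illustrative choice)
train_x = [float(k) for k in range(8)]
train_y = [math.sin(x) for x in train_x]
model = lagrange_poly(train_x, train_y)

fit_error = max(abs(model(x) - y) for x, y in zip(train_x, train_y))
interp_error = abs(model(3.5) - math.sin(3.5))    # inside the data domain
extrap_error = abs(model(12.0) - math.sin(12.0))  # outside the data domain

print("max error on training points: %.2e" % fit_error)
print("error at x = 3.5 (interpolation): %.2e" % interp_error)
print("error at x = 12.0 (extrapolation): %.2e" % extrap_error)
```

The same construction would pass just as exactly through eight random numbers, which is the sense in which a perfect fit from such a model carries little evidential weight.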

This problem with complex mathematical models was probably first encountered with models of planetary motion in antiquity, the infamous epicycles of Ptolemy and his predecessors in ancient Greece and probably Babylonia/Sumeria (modern Iraq).  Pythagoras visited both Babylonia and Egypt.  The early Greek accounts of his life suggest he brought back the early Greek math and astronomy from Babylonia and Egypt.

Early astronomers, probably first in Babylonia, attempted to model the motion of Mars and other planets through the Zodiac as uniform circular motion around a stationary Earth.  This was grossly incorrect in the case of Mars which backs up for about two months about every two years.  Thus the early astronomers introduced an epicycle for Mars. They speculated that Mars moved in uniform circular motion around a point that in turn moved in uniform circular motion around the Earth.  With a single epicycle they could reproduce the biennial backing up with some errors.  To achieve greater accuracy, they added more and more epicycles, producing an ever more complex model that had some predictive power.  Indeed the state of the art Ptolemaic model in the sixteenth century was better than Copernicus’ new heliocentric model which also relied on uniform circular motion and epicycles.
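
The single epicycle construction described above can be sketched numerically.  In the sketch below Mars moves on a small circle (the epicycle) whose center moves on a large circle (the deferent) around a stationary Earth, and the program counts the days on which Mars appears to move backwards through the Zodiac.  The radii and periods are assumptions based on modern approximate values for the orbits of Mars and the Earth, not the historical Ptolemaic parameters, and the function names are hypothetical.

```python
import math

# Assumed illustrative parameters (modern approximate values, not Ptolemy's)
DEFERENT_RADIUS = 1.524   # relative size of the deferent
DEFERENT_PERIOD = 1.881   # years for the epicycle center to circle the Earth
EPICYCLE_RADIUS = 1.0     # relative size of the epicycle
EPICYCLE_PERIOD = 1.0     # years for Mars to circle the epicycle center

def mars_longitude(t_years):
    """Longitude (radians) of Mars seen from a stationary Earth at the origin."""
    a1 = 2.0 * math.pi * t_years / DEFERENT_PERIOD
    a2 = 2.0 * math.pi * t_years / EPICYCLE_PERIOD
    x = DEFERENT_RADIUS * math.cos(a1) + EPICYCLE_RADIUS * math.cos(a2)
    y = DEFERENT_RADIUS * math.sin(a1) + EPICYCLE_RADIUS * math.sin(a2)
    return math.atan2(y, x)

def wrapped_difference(a, b):
    """Smallest signed difference a - b between two angles, in [-pi, pi]."""
    return math.atan2(math.sin(a - b), math.cos(a - b))

DAYS_PER_YEAR = 365.25
n_days = int(5 * DAYS_PER_YEAR)   # simulate five years in one-day steps
retrograde_days = 0
episodes = 0
was_retrograde = False
for day in range(n_days):
    t0 = day / DAYS_PER_YEAR
    t1 = (day + 1) / DAYS_PER_YEAR
    step = wrapped_difference(mars_longitude(t1), mars_longitude(t0))
    moving_backwards = step < 0.0
    if moving_backwards:
        retrograde_days += 1
        if not was_retrograde:
            episodes += 1   # a new backing-up episode has started
    was_retrograde = moving_backwards

print("retrograde episodes in five years:", episodes)
print("average length in days: %.0f" % (retrograde_days / episodes))
```

With these assumed values the model backs up for roughly two months at a time, about every two years, which is the qualitative behavior the single epicycle was introduced to reproduce.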

The Ptolemaic model of planetary motion is difficult to falsify because one can keep adding more epicycles to account for discrepancies between the theory and observation.  It also has some predictive power.  It is an example of a “rubber sheet” model, not a “saran wrap” model.

In the real world, falsifiability is not a simple binary criterion.  A mathematical model is not either falsifiable and therefore good or not falsifiable and therefore bad.  Rather falsifiability falls on a continuum.  In general, extremely complex theories are hard to falsify and not predictive outside of the domain of the data used to infer (fit) the complex theory.  Simpler theories tend to be easier to falsify and if correct are sometimes very predictive as with Kepler’s Laws of Planetary Motion and subsequently Newton’s Law of Gravitation, from which Kepler’s Laws can be derived.

Unfortunately, this experience with mathematical modeling is known but has not been quantified in a rigorous way by mathematicians and scientists.  Falsifiability remains a slogan primarily used against creationists, parapsychologists, and other groups rather than a rigorous criterion to evaluate theories like the standard model, supersymmetry, or superstrings.

A worrying concern about the standard model, with its twenty-five fundamental particles, complex two-page Lagrangian (mathematical formula), and seemingly ad hoc elements such as the Higgs particle and Kobayashi-Maskawa matrix, is that it is matching real data entirely or in part due to its complexity and inherent plasticity, much like the historical epicycles or a polynomial with many terms.  This concern is not just about subjective “beauty.”

Sheldon Glashow’s original formulation of what became the modern standard model was much simpler: it did not include the Higgs particle, the charm, top, or bottom quarks, or a number of other elements (S.L. Glashow (1961). “Partial-symmetries of weak interactions”. Nuclear Physics. 22 (4): 579–588).  Much as epicycles were added to the early theories of planetary motion, these elements were added on during the 1960’s and 1970’s to achieve agreement with experimental results and theoretical prejudices.  In evaluating the seeming success and falsifiability of the standard model, we need to consider not only the terms that were added over the decades but also the terms that might have been added to salvage the theory.

Theories with symmetry have fewer adjustable parameters and are less plastic, less flexible, and less able to match the data regardless of what data is presented.  This forms an objective but poorly quantified basis for intuitive notions of the “mathematical beauty” of symmetry in physics and other fields.

The problem is that we can so far describe this known weakness of mathematical models, poor falsifiability or plasticity (at the most extreme, an ability to fit any data set), only qualitatively, with words such as “beauty,” “symmetry,” or “simplicity.”  We cannot yet express it in rigorous quantitative terms.

Big Science and Big Bucks

Much of the book concerns the way the Large Hadron Collider and its huge budget warped the thinking and research results of theoretical physicists, rewarding some, like Nima Arkani-Hamed, who could produce catchy arguments that new physics would be found at the LHC, and encouraging many more to produce questionable arguments that supersymmetry, hidden dimensions, or other glamorous exotica would be discovered.  The author recounts how her Ph.D. thesis supervisor redirected her research to a topic, “Black Holes in Large Extra Dimensions” (2003), that would support the LHC.

Particle accelerators and other particle physics experiments have a long history of huge cost and schedule overruns — which are generally omitted or glossed over in popular and semi-popular accounts.  The not-so-funny joke that I learned in graduate school was “multiply the schedule by pi (3.14)” to get the real schedule.  A variant was “multiply the schedule by pi for running around in a circle.”  Time is money and the huge delays usually mean huge cost overruns.  Often these have involved problems with the magnets in the accelerators.

The LHC was no exception to this historical pattern.  It went substantially over budget and schedule even before its first turn-on in 2008.  Days after that first turn-on, an electrical fault between magnets triggered a violent helium release that damaged dozens of magnets in the multi-billion-dollar accelerator, forcing expensive and time-consuming repairs (see CERN’s whitewash of the disaster here).  The LHC faced significant criticism in Europe over the cost overruns even before the 2008 magnet incident.  The reported discovery of the Higgs boson in 2012 has substantially blunted the criticism; one could argue the LHC had to make a discovery.  🙂

The cost and schedule overruns have contributed to the cancellation of several accelerator projects including ISABELLE at the Brookhaven National Laboratory on Long Island and the Superconducting Super Collider (SSC) in Texas.  The particle physics projects must compete with much bigger, more politically connected, and more popular programs.

The frequent cost and schedule overruns mean that pursuing a Ph.D. in experimental particle physics often takes much longer than advertised and is often quite disappointing, as happened to large numbers of LHC graduate students.  For theorists, the pressure to provide a justification for the multi-billion-dollar projects is undoubtedly substantial.

While genuine advances in fundamental physics may ultimately produce new energy technologies or other advances that will benefit humanity greatly, the billions spent on particle accelerators and other big physics experiments are certain, here and now.  The aging faculty at universities and senior scientists at the few research labs like CERN who largely control the direction of particle physics cannot easily retrain for new fields, unlike disappointed graduate students or postdocs in their twenties and early thirties.  The hot new fields like computers and hot high-tech employers such as Google are noted for their preference for twenty-somethings and hostility to employees even in their thirties.  The existing energy industry seems remarkably unconcerned about alleged “peak oil” or climate change and empirically invests little if anything in finding replacement technologies.

Is there a way forward?

Sabine, who writes on her blog that she is probably leaving particle physics soon, offers some suggestions to improve the field, primarily focusing on learning about and avoiding cognitive biases.  This reminds me a bit of the unconscious bias training that Google and other Silicon Valley companies have embraced in a purported attempt to fix their seeming avoidance of employees from certain groups — with dismal results so far.  Responding rationally if perhaps unethically to clear economic rewards is not a cognitive bias and almost certainly won’t respond to cognitive bias training.  If I learn that I am unconsciously doing something because it is in my economic interest to do so, will I stop?

Future progress in fundamental physics probably depends on finding new informative data that does not cost billions of dollars (for example, a renaissance of tabletop experiments), reanalysis of existing data, and improved methods of data analysis such as putting falsifiability on a rigorous quantitative basis.

(C) 2018 by John F. McGowan, Ph.D.

About Me

John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).