John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).
This is a short video about why erroneous STEM (Science, Technology, Engineering, and Mathematics) worker shortage claims seem plausible in the United States.
Links:
A Skeptical Look at STEM Shortage Numbers: https://wordpress.jmcgowan.com/wp/a-skeptical-look-at-stem-shortage-numbers/
Code.org What Most Schools Don’t Teach Video: https://youtu.be/nKIu9yen5nc
Political Polarization in the United States: https://www.pewresearch.org/topics/political-polarization/
Support Us: PATREON: https://www.patreon.com/user?u=28764298
(C) 2020 by John F. McGowan, Ph.D.
About Me
John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).
John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).
This is an interview that I did on Martian methane, possible Martian life, and my work on oil and natural gas on Mars while at NASA Ames Research Center — conducted by John Michael Godier on his Event Horizon show.
(C) 2019 by John F. McGowan, Ph.D.
About Me
John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).
John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).
Remarkably, this article prominently claims a shortage of STEM workers in the United States, citing a study by the National Association of Manufacturers (NAM) and the Deloitte accounting firm claiming that employers will need to fill 3.5 million STEM jobs by 2025, with more than 2 million of them going unfilled because of the lack of highly skilled candidates in demand, while also stating:
Higher barriers to H-1B visa access is compounding the STEM shortage: there are low numbers of U.S. STEM field graduates coupled with decreasing foreign STEM talent to mitigate the supply shortage. Forbes reportsin 2016 that there were 568,000 STEM graduates in the U.S., compared to 2.6 million in India and 4.7 million in China.
Emphasis Added
Note that an annual rate of production of 568,000 STEM graduates in the United States multiplied by the seven years between 2018 (the date of the article) and the 2025 date of the NAM/Deloitte projection gives over 3.9 million STEM graduates, substantially more than the NAM projection of 3.5 million jobs to be filled. Thus:
What STEM Shortage?
In fact according to the US Census about half of all US college graduates with STEM degrees are not working in STEM professions despite pervasive claims of a desperate or severe shortage of STEM graduates by STEM employers and others! (For a more in depth discussion of STEM shortage numbers see my recent article “A Skeptical Look at STEM Shortage Numbers“)
Note that the Recruiting Today article, repeating a common theme in STEM shortage claims, attributes the non-existent STEM shortage to a lack of interest in STEM fields by pre-teen and teen K-12 students in the United States, implicitly absolving colleges and universities (or STEM employers) of any responsibility for the alleged STEM shortage. At the same time it actually cites a number of annual STEM graduates that grossly contradicts its assertion of lack of interest in STEM fields and its central claim of a STEM shortage at all.
Neither the article’s author or presumably editor at Recruiting Daily nor Google nor Google’s vaunted ranking algorithm seems to have noticed this astonishing contradiction.
Why is an article on “STEM shortage” with such an extreme (and unexplained) internal inconsistency ranked number oneon Google?
(C) 2019 by John F. McGowan, Ph.D.
About Me
John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).
It is common to encounter claims of a “desperate” or “severe” shortage of STEM (Science, Technology, Engineering, and Mathematics) workers, either current or projected, usually from employers of STEM workers. These claims are perennial and date back at least to the 1940’s after World War II despite the huge number of STEM workers employed in wartime STEM projects (the Manhattan Project that developed the atomic bomb, military radar, code breaking machines and computers, the B-29 and other high tech bombers, the development of penicillin, K-rations, etc.). This article takes a look at the STEM degree numbers in the National Science Foundation’s Science and Engineering Indicators 2018 report.
I looked at the total Science and Engineering bachelors degrees granted each year which includes degrees in Social Science, Psychology, Biological and agricultural sciences as well as hard core Engineering, Computer Science, Mathematics, and Physical Sciences. I also looked specifically at the totals for “hard” STEM degrees (Engineering, Computer Science, Mathematics, and Physical Sciences). I also included the total number of K-12 students who pass (score 3,4, or 5 out of 5) on the Advanced Placement (AP) Calculus Exam (either the AB exam or the more advanced BC exam) each year.
I fitted an exponential growth model to each data series. The exponential growth model fits well to the total STEM degrees and AP passing data. The exponential growth model roughly agrees with the hard STEM degree data, but there is a clear difference, reflected in the coefficient of determination (R-SQUARED) of 0.76 meaning the model explains about 76 percent of the variation in the data.
One can easily see the the number of hard STEM degrees significantly exceeds the trend line in the early 00’s (2000 to about 2004) and drops well below from 2004 to 2008, rebounding in 2008. This probably reflects the surge in CS degrees specifically due to the Internet/dot com bubble (1995-2001).
There appears to be a lag of about four years between the actual dot com crash usually dated to a stock market drop in March of 2000 and the drop in production of STEM bachelor’s degrees in about 2004.
Analysis results:
TOTAL Scientists and Engineers 2016: 6,900,000
ALL STEM Bachelor's Degrees
ESTIMATED TOTAL IN 2016 SINCE 1970: 15,970,052
TOTAL FROM 2001 to 2015 (Science and Engineering Indicators 2018) 7,724,850
ESTIMATED FUTURE STUDENTS (2016 to 2026): 8,758,536
ANNUAL GROWTH RATE: 3.45 % US POPULATION GROWTH RATE (2016): 0.7 %
HARD STEM DEGREES ONLY (Engineering, Physical Sciences, Math, CS)
ESTIMATED TOTAL IN 2016 SINCE 1970: 5,309,239
TOTAL FROM 2001 to 2015 (Science and Engineering Indicators 2018) 2,429,300
ESTIMATED FUTURE STUDENTS (2016 to 2026): 2,565,802
ANNUAL GROWTH RATE: 2.88 % US POPULATION GROWTH RATE (2016): 0.7 %
STUDENTS PASSING AP CALCULUS EXAM
ESTIMATED TOTAL IN 2016 SINCE 1970: 5,045,848
TOTAL FROM 2002 to 2016 (College Board) 3,038,279
ESTIMATED FUTURE STUDENTS (2016 to 2026): 4,199,602
ANNUAL GROWTH RATE: 5.53 % US POPULATION GROWTH RATE (2016): 0.7 %
estimate_college_stem.py ALL DONE
The table below gives the raw numbers from Figure 02-10 in the NSF Science and Engineering Indicators 2018 report with a column for total STEM degrees and a column for total STEM degrees in hard science and technology subjects (Engineering, Computer Science, Mathematics, and Physical Sciences) added for clarity:
In the raw numbers, we see steady growth in social science and psychology STEM degrees from 2000 to 2015 with no obvious sign of the Internet/dot com bubble. There is a slight drop in Biological and agricultural sciences degrees in the early 00s. Somewhat larger drops can be seen in Engineering and Physical Sciences degrees in the early 00’s as well as a concomittant sharp rise in Computer Science (CS) degrees. This probably reflects strong STEM students shifting into CS degrees.
The number of K-12 students taking and passing the AP Calculus Exam (either the AB or more advanced BC exam) grows continuously and rapidly during the entire period from 1997 to 2016, growing at over five percent per year, far above the United States population growth rate of 0.7 percent per year.
The number of college students earning hard STEM degrees appears to be slightly smaller than the four year lagged number of K-12 students passing the AP exam, suggesting some attrition of strong STEM students at the college level. We might expect the number of hard STEM bachelors degrees granted each year to be the same or very close to the number of AP Exam passing students four years earlier.
A model using only the hard STEM bachelors degree students gives a total number of STEM college students produced since 1970 of five million, pretty close to the number of K-12 students estimated from the AP Calculus exam data. This is somewhat less than the 6.9 million total employed STEM workers estimated by the United States Bureau of Labor Statistics.
Including all STEM degrees gives a huge surplus of STEM students/workers, most not employed in a STEM field as reported by the US Census and numerous media reports.
The hard STEM degree model predicts about 2.5 million new STEM workers graduating between 2016 and 2026. This is slightly more than the number of STEM job openings seemingly predicted by the Bureau of Labor Statistics (about 800,000 new STEM jobs and about 1.5 million retirements and deaths of current aging STEM workers giving a total of about 2.3 million “new” jobs). The AP student model predicts about 4 million new STEM workers, far exceeding the BLS predictions and most other STEM employment predictions.
The data and models do not include the effects of immigration and guest worker programs such as the controversial H1-B visa, L1 visa, OPT visa, and O (“Genius”) visa. Immigrants and guest workers play an outsized role in the STEM labor force and specifically in the computer science/software labor force (estimated at 3-4 million workers, over half of the STEM labor force).
Difficulty of Evaluating “Soft” STEM Degrees
Social science, psychology, biological and agricultural sciences STEM degrees vary widely in rigor and technical requirements. The pioneering statistician Ronald Fisher developed many of his famous methods as an agricultural researcher at the Rothamsted agricultural research institute. The leading data analysis tool SAS from the SAS Institute was originally developed by agricultural researchers at North Carolina State University. IBM’s SPSS (Statistics Package for Social Sciences) data analysis tool, number three in the market, was developed for social sciences. Many “hard” sciences such as experimental particle physics use methods developed by Fisher and other agricultural and social scientists. Nonetheless, many “soft” science STEM degrees do not involve the same level of quantitative, logical, and programming skills typical of “hard” STEM fields.
In general, STEM degrees at the college level are not highly standardized. There is no national or international standard test or tests comparable to the AP Calculus exams at the K-12 level to get a good national estimate of the number of qualified students.
The numbers suggest but do not prove that most K-12 students who take and pass AP Calculus continue on to hard STEM degrees or some type of rigorous biology or agricultural sciences degree — hence the slight drop in biology and agricultural science degrees during the dot com bubble period with students shifting to CS degrees.
Conclusion
Both the college “hard” STEM degree data and the K-12 AP Calculus exam data strongly suggest that the United States can and will produce more qualified STEM students than job openings predicted for the 2016 to 2026 period. Somewhat more according to the college data, much more according to the AP exam data, and a huge surplus if all STEM degrees including psychology and social science are considered. The data and models do not include the substantial number of immigrants and guest workers in STEM jobs in the United States.
NOTE: The raw data in text CSV (comma separated values) format and the Python analysis program are included in the appendix below.
(C) 2018 by John F. McGowan, Ph.D.
About Me
John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).
Year,Social sciences,Biological and agricultural sciences,Psychology,Engineering,Computer sciences,Physical sciences,Mathematics and statistics,Total STEM,Total Hard STEM
2000,113.50,83.13,74.66,59.49,37.52,18.60,11.71,398.61,127.32
2001,114.47,79.48,74.12,59.21,43.60,18.11,11.44,400.43,132.36
2002,119.11,79.03,77.30,60.61,49.71,17.98,12.25,415.99,140.55
2003,129.74,81.22,79.16,63.79,57.93,18.06,12.86,442.76,152.64
2004,137.74,81.81,82.61,64.68,59.97,18.12,13.74,458.67,156.51
2005,144.57,85.09,86.03,66.15,54.59,18.96,14.82,470.21,154.52
2006,148.11,90.28,88.55,68.23,48.00,20.38,15.31,478.86,151.92
2007,150.73,97.04,90.50,68.27,42.60,21.08,15.55,485.77,147.50
2008,155.67,100.87,92.99,69.91,38.92,21.97,15.84,496.17,146.64
2009,158.18,104.73,94.74,70.60,38.50,22.48,16.21,505.44,147.79
2010,163.07,110.02,97.75,74.40,40.11,23.20,16.83,525.38,154.54
2011,172.18,116.41,101.57,78.10,43.59,24.50,18.02,554.37,164.21
2012,177.33,124.96,109.72,83.26,47.96,26.29,19.81,589.33,177.32
2013,179.26,132.31,115.37,87.81,51.59,27.57,21.57,615.48,188.54
2014,177.94,138.32,118.40,93.95,56.13,28.95,22.23,635.92,201.26
2015,173.72,144.58,118.77,99.91,60.31,29.64,23.14,650.07,213.00
estimate_college_stem.py
#
# Estimate the total production of STEM students at the
# College level from BS degrees granted (United States)
#
# (C) 2018 by John F. McGowan, Ph.D. (ceo@mathematical-software.com)
#
# Python standard libraries
import os
import sys
import time
# Numerical/Scientific Python libraries
import numpy as np
import scipy.optimize as opt # curve_fit()
import pandas as pd # reading text CSV files etc.
# Graphics
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
from mpl_toolkits.mplot3d import Axes3D
# customize fonts
SMALL_SIZE = 8
MEDIUM_SIZE = 10
LARGE_SIZE = 12
XL_SIZE = 14
XXL_SIZE = 16
plt.rc('font', size=XL_SIZE) # controls default text sizes
plt.rc('axes', titlesize=XL_SIZE) # fontsize of the axes title
plt.rc('axes', labelsize=XL_SIZE) # fontsize of the x and y labels
plt.rc('xtick', labelsize=XL_SIZE) # fontsize of the tick labels
plt.rc('ytick', labelsize=XL_SIZE) # fontsize of the tick labels
plt.rc('legend', fontsize=XL_SIZE) # legend fontsize
plt.rc('figure', titlesize=XL_SIZE) # fontsize of the figure title
# STEM Bachelors Degrees earned by year (about 2000 to 2015)
#
# data from National Science Foundation (NSF)/ National Science Board
# Science and Engineering Indicators 2018 Report
# https://www.nsf.gov/statistics/2018/nsb20181/
# Figure 02-10
#
input_file = "STEM Degrees with Totals.csv"
if len(sys.argv) > 1:
index = 1
while index < len(sys.argv):
if sys.argv[index] in ["-i", "-input"]:
input_file = sys.argv[index+1]
index += 1
elif sys.argv[index] in ["-h", "--help", "-help", "-?"]:
print("Usage:", sys.argv[0], " -i input_file='AP Calculus Totals by Year.csv'")
sys.exit(0)
index +=1
print(__file__, "started", time.ctime()) # time stamp
print("Processing data from: ", input_file)
# read text CSV file (exported from spreadsheet)
df = pd.read_csv(input_file)
# drop NaNs for missing values in Pandas
df.dropna()
# get number of students who pass AP Calculus Exam (AB or BC)
# each year
df_ap_pass = pd.read_csv("AP Calculus Totals.csv")
ap_year = df_ap_pass.values[:,0]
ap_total = df_ap_pass.values[:,1]
# numerical data
hard_stem_str = df.values[1:,-1] # engineering, physical sciences, math/stat, CS
all_stem_str = df.values[1:,-2] # includes social science, psychology, agriculture etc.
hard_stem = np.zeros(hard_stem_str.shape)
all_stem = np.zeros(all_stem_str.shape)
for index, val in enumerate(hard_stem_str.ravel()):
if isinstance(val, str):
hard_stem[index] = np.float(val.replace(',',''))
elif isinstance(val, (float, np.float)):
hard_stem[index] = val
else:
raise TypeError("unsupported type " + str(type(val)))
for index, val in enumerate(all_stem_str.ravel()):
if isinstance(val, str):
all_stem[index] = np.float(val.replace(',', ''))
elif isinstance(val, (float, np.float)):
all_stem[index] = val
else:
raise TypeError("unsupported type " + str(type(val)))
DEGREES_PER_UNIT = 1000
# units are thousands of degrees granted
all_stem = DEGREES_PER_UNIT*all_stem
hard_stem = DEGREES_PER_UNIT*hard_stem
years_str = df.values[1:,0]
years = np.zeros(years_str.shape)
for index, val in enumerate(years_str.ravel()):
years[index] = np.float(val)
# almost everyone in the labor force graduated since 1970
# someone 18 years old in 1970 is 66 today (2018)
START_YEAR = 1970
def my_exp(x, *p):
"""
exponential model for curve_fit(...)
"""
return p[0]*np.exp(p[1]*(x - START_YEAR))
# starting guess for model parameters
p_start = [ 50000.0, 0.01 ]
# fit all STEM degree data
popt, pcov = opt.curve_fit(my_exp, years, all_stem, p_start)
# fit hard STEM degree data
popt_hard_stem, pcov_hard_stem = opt.curve_fit(my_exp, \
years, \
hard_stem, \
p_start)
# fit AP Students data
popt_ap, pcov_ap = opt.curve_fit(my_exp, \
ap_year, \
ap_total, \
p_start)
print(popt) # sanity check
STOP_YEAR = 2016
NYEARS = (STOP_YEAR - START_YEAR + 1)
years_fit = np.linspace(START_YEAR, STOP_YEAR, NYEARS)
n_fit = my_exp(years_fit, *popt)
n_pred = my_exp(years, *popt)
r2 = 1.0 - (n_pred - all_stem).var()/all_stem.var()
r2_str = "%4.3f" % r2
n_fit_hard = my_exp(years_fit, *popt_hard_stem)
n_pred_hard = my_exp(years, *popt_hard_stem)
r2_hard = 1.0 - (n_pred_hard - hard_stem).var()/hard_stem.var()
r2_hard_str = "%4.3f" % r2_hard
n_fit_ap = my_exp(years_fit, *popt_ap)
n_pred_ap = my_exp(ap_year, *popt_ap)
r2_ap = 1.0 - (n_pred_ap - ap_total).var()/ap_total.var()
r2_ap_str = "%4.3f" % r2_ap
cum_all_stem = n_fit.sum()
cum_hard_stem = n_fit_hard.sum()
cum_ap_stem = n_fit_ap.sum()
# to match BLS projections
future_years = np.linspace(2016, 2026, 11)
assert future_years.size == 11 # sanity check
future_students = my_exp(future_years, *popt)
future_students_hard = my_exp(future_years, *popt_hard_stem)
future_students_ap = my_exp(future_years, *popt_ap)
# https://fas.org/sgp/crs/misc/R43061.pdf
#
# The U.S. Science and Engineering Workforce: Recent, Current,
# and Projected Employment, Wages, and Unemployment
#
# by John F. Sargent Jr.
# Specialist in Science and Technology Policy
# November 2, 2017
#
# Congressional Research Service 7-5700 www.crs.gov R43061
#
# "In 2016, there were 6.9 million scientists and engineers (as
# defined in this report) employed in the United States, accounting
# for 4.9 % of total U.S. employment."
#
# BLS astonishing/bizarre projections for 2016-2026
# "The Bureau of Labor Statistics (BLS) projects that the number of S&E
# jobs will grow by 853,600 between 2016 and 2026 , a growth rate
# (1.1 % CAGR) that is somewhat faster than that of the overall
# workforce ( 0.7 %). In addition, BLS projects that 5.179 million
# scientists and engineers will be needed due to labor force exits and
# occupational transfers (referred to collectively as occupational
# separations ). BLS projects the total number of openings in S&E due to growth ,
# labor force exits, and occupational transfers between 2016 and 2026 to be
# 6.033 million, including 3.477 million in the computer occupations and
# 1.265 million in the engineering occupations."
# NOTE: This appears to project 5.170/6.9 or 75 percent!!!! of current STEM
# labor force LEAVE THE STEM PROFESSIONS by 2026!!!!
# "{:,}".format(value) to specify the comma separated thousands format
#
print("TOTAL Scientists and Engineers 2016:", "{:,.0f}".format(6.9e6))
# ALL STEM
print("\nALL STEM Bachelor's Degrees")
print("ESTIMATED TOTAL IN 2016 SINCE ", START_YEAR, ": ", \
"{:,.0f}".format(cum_all_stem), sep='')
# don't use comma grouping for years
print("TOTAL FROM", "{:.0f}".format(years_str[0]), \
"to 2015 (Science and Engineering Indicators 2018) ", \
"{:,.0f}".format(all_stem.sum()))
print("ESTIMATED FUTURE STUDENTS (2016 to 2026):", \
"{:,.0f}".format(future_students.sum()))
# annual growth rate of students taking AP Calculus
growth_rate_pct = (np.exp(popt[1]) - 1.0)*100
print("ANNUAL GROWTH RATE: ", "{:,.2f}".format(growth_rate_pct), \
"% US POPULATION GROWTH RATE (2016): 0.7 %")
# HARD STEM
print("\nHARD STEM DEGREES ONLY (Engineering, Physical Sciences, Math, CS)")
print("ESTIMATED TOTAL IN 2016 SINCE ", START_YEAR, ": ", \
"{:,.0f}".format(cum_hard_stem), sep='')
# don't use comma grouping for years
print("TOTAL FROM", "{:.0f}".format(years_str[0]), \
"to 2015 (Science and Engineering Indicators 2018) ", \
"{:,.0f}".format(hard_stem.sum()))
print("ESTIMATED FUTURE STUDENTS (2016 to 2026):", \
"{:,.0f}".format(future_students_hard.sum()))
# annual growth rate of students taking AP Calculus
growth_rate_pct_hard = (np.exp(popt_hard_stem[1]) - 1.0)*100
print("ANNUAL GROWTH RATE: ", "{:,.2f}".format(growth_rate_pct_hard), \
"% US POPULATION GROWTH RATE (2016): 0.7 %")
# AP STEM -- Students passing AP Calculus Exam Each Year
print("\nSTUDENTS PASSING AP CALCULUS EXAM")
print("ESTIMATED TOTAL IN 2016 SINCE ", START_YEAR, ": ", \
"{:,.0f}".format(cum_ap_stem), sep='')
# don't use comma grouping for years
print("TOTAL FROM", "{:.0f}".format(ap_year[-1]), \
"to", "{:.0f}".format(ap_year[0])," (College Board) ", \
"{:,.0f}".format(ap_total.sum()))
print("ESTIMATED FUTURE STUDENTS (2016 to 2026):", \
"{:,.0f}".format(future_students_ap.sum()))
# annual growth rate of students taking AP Calculus
growth_rate_pct_ap = (np.exp(popt_ap[1]) - 1.0)*100
print("ANNUAL GROWTH RATE: ", "{:,.2f}".format(growth_rate_pct_ap), \
"% US POPULATION GROWTH RATE (2016): 0.7 %")
# US Census reports 0.7 percent annual growth of US population in 2016
# SOURCE: https://www.census.gov/newsroom/press-releases/2016/cb16-214.html
#
f1 = plt.figure(figsize=(12,9))
ax = plt.gca()
# add commas to tick values (e.g. 1,000 instead of 1000)
ax.get_yaxis().set_major_formatter(
ticker.FuncFormatter(lambda x, p: format(int(x), ',')))
DOT_COM_CRASH = 2000.25 # usually dated march 10, 2000
OCT_2008_CRASH = 2008.75 # usually dated October 11, 2008
DELTA_LABEL_YEARS = 0.5
plt.plot(years_fit, n_fit, 'g', linewidth=3, label='ALL STEM FIT')
plt.plot(years, all_stem, 'bs', markersize=10, label='ALL STEM DATA')
plt.plot(years_fit, n_fit_hard, 'r', linewidth=3, label='HARD STEM FIT')
plt.plot(years, hard_stem, 'ms', markersize=10, label='HARD STEM DATA')
plt.plot(years_fit, n_fit_ap, 'k', linewidth=3, label='AP STEM FIT')
plt.plot(ap_year, ap_total, 'cd', markersize=10, label='AP STEM DATA')
[ylow, yhigh] = plt.ylim()
dy = yhigh - ylow
# add marker lines for crashes
plt.plot((DOT_COM_CRASH, DOT_COM_CRASH), (ylow+0.1*dy, yhigh), 'b-')
plt.text(DOT_COM_CRASH + DELTA_LABEL_YEARS, 0.9*yhigh, '<-- DOT COM CRASH')
# plt.arrow(...) add arrow (arrow does not render correctly)
plt.plot((OCT_2008_CRASH, OCT_2008_CRASH), (ylow+0.1*dy, 0.8*yhigh), 'b-')
plt.text(OCT_2008_CRASH+DELTA_LABEL_YEARS, 0.5*yhigh, '<-- 2008 CRASH')
plt.legend()
plt.title('STUDENTS STEM BACHELORS DEGREES (ALL R**2=' \
+ r2_str + ', HARD R**2=' + r2_hard_str + \
', AP R**2=' + r2_ap_str + ')')
plt.xlabel('YEAR')
plt.ylabel('TOTAL STEM BS DEGREES')
# appear to need to do this after the plots
# to get valid ranges
[xlow, xhigh] = plt.xlim()
[ylow, yhigh] = plt.ylim()
dx = xhigh - xlow
dy = yhigh - ylow
# put input data file name in lower right corner
plt.text(xlow + 0.65*dx, \
ylow + 0.05*dy, \
input_file, \
bbox=dict(facecolor='red', alpha=0.2))
plt.show()
f1.savefig('College_STEM_Degrees.jpg')
print(__file__, "ALL DONE")
It is common to encounter claims of a shortage or projected shortage of some number of millions of STEM (Science, Technology, Engineering and Mathematics) workers — sometimes phrased as millions of new jobs that may go unfilled due to the shortage of STEM workers. For example, the recent press release from Emerson Electric claiming a STEM worker shortage crisis includes the following paragraph:
While the survey found students today are twice as likely to study STEM fields compared to their parents, the number of roles requiring STEM expertise is growing at a rate that exceeds current workforce capacity. In manufacturing alone, the National Association of Manufacturing and Deloitte predict the U.S. will need to fill about 3.5 million jobs by 2025; yet as many as 2 million of those jobs may go unfilled, due to difficulty finding people with the skills in demand.
Here are the numbers for United States students achieving a score of at least 3 out of 5, considered a passing grade — qualified, on the Advanced Placement (AP) Calculus exam, either the AB or the more advanced BC Calculus exam from the College Board. BC Calculus is equivalent to a full first year calculus course at a top STEM university or college such as MIT or Caltech. A score of 3 on an AP exam officially means “qualified.”
Calculus is a challenging college level quantitative course. Being rated as qualified or better in calculus is a substantial accomplishment. Calculus is taken by most STEM students regardless of specific STEM degree or profession. Mastering calculus demonstrates motivation, hard work, and innate ability. Calculus is required for many STEM degrees and professions. Calculus is “good to know” for nearly all STEM degrees and professions, even if not strictly required.
The AP exams are standardized tests taken by students throughout the United States, thus removing concerns about the quality of grades and certifications from differing institutions and teachers.
In 2016, about 284,000 students scored 3 or higher on the AP Calculus Exam, either the AB or the more advanced BC exam. In total, just over three million students scored 3 or higher on the AP Calculus exams from 2002 through 2016.
Here are the results of fitting a simple exponential growth model to the data to estimate the number of students who have received a score of 3 or higher on the AP Calculus Exam each year since 1970:
This model estimates a total of just over five (5) million students scoring 3 or higher on the AP Calculus Exams since 1970 (48 years ago). The model has a coefficient of determination of 0.988, meaning only 1.2 percent of the variation in the data is unexplained by the model. This is excellent agreement between the model and data.
The model predicts that about 4.2 million students will take the AP Calculus Exam and score 3 or higher between 2016 and 2026 (far more than the 2 million and 3.5 million numbers quoted in the Emerson press release). This is in addition to the over 3 million students the College Board says took the exam and scored 3 or better between 2002 and 2016 and the estimated 2 million between 1970 and 2002.
The United States produces many qualified STEM students at the K-12 level!
In 2016, the United States Bureau of Labor Statistics (BLS), arguably a more reliable and authoritative source than the Emerson press release, predicted the total number of STEM jobs would grow by 853,600 jobs from 2016 to 2026 ( a ten year prediction). This is considerably less than the four million students expected to score 3 or higher on AP Calculus between 2016 and 2026.
The BLS estimates that there were 7.3 million science and engineering workers employed in 2016, with a projected increase to 8.2 million in 2026.
The BLS also predicted that an additional 1.439 million scientists and engineers will exit the labor force due to factors such as retirement, death, and to care for family members . This is plausible assuming the science and engineering workforce has ages roughly uniformly distributed between 22 (typical college graduation age) and 65 (typically retirement age). In ten years, about a quarter of the science and engineering workforce might be expected to retire, die, or leave to care for family members. One quarter of 7.3 million is about 1.825 million.
Taken together overall growth (835,000) and retirements, deaths, etc. (about 1.8 million) give a total of about 2.635 million openings, considerably lessthan the predicted four million students who will take AP Calculus and score 3 or higher between 2016 and 2026. In fact, extrapolating the 284,000 students who scored 3 or higher on the AP Calculus Exam in 2016 forward for ten years with no growth (an unrealistic assumption) still gives 2.8 million students, more than the projected number of openings.
Occupational Transfers
However, the BLS then introduces a remarkable, if not bizarre additional category of projected “openings.” Here is John Sargent of the Congressional Research Service (CRS)’s discussion of the BLS projections:
In addition to the job openings created by growth in the number of jobs in S&E occupations, BLS projects that an additional 1. 439 million scientists and engineers will exit the labor force due to factors such as retirement, death, and to care for family members . This brings the number of S&E job openings created by job growth and those exiting the workforce to nearly 2.3 million. In addition, BLS projects that there will be an additional 3.7 million openings created by occupational transfers in S&E positions during this period, that is , workers in S&E occupations who leave their jobs to take jobs in different occupations, S&E or non-S&E. The BLS projections do not include data that allow for a quantitative analysis of how many new workers (those not in the labor market in 2016) will be required for openings created by job growth, labor force exits, and occupational transfers , as there is no detail to how many of the S&E openings are expected to be filled by workers transferring into these openings from S&E occupations and from non-S&E occupations (that is, some workers may transfer from one S&E occupation to another, some may transfer from an S&E occupations to a non-S&E occupations, and still others may transfer from a non-S&E occupation into an S&E occupations ) . According to BLS, the projections methodology allows for multiple occupational transfers from the same position during the 10-year projection period, but only one occupational transfer in a given year.
The BLS appears to be claiming that at least 3.7 million allegedly rare and difficult to find, highly paid STEM workers, over one half of currently employed STEM workers (7.3 million) will, for some unexplained reason — not retirement, death, or caring for a loved one — leave their profession!
This bizarre unexplained projection, now totaling six (6) million openings, finally manages to exceed the estimated four million students who will probably take AP Calculus and score 3 or better on the AP exam between 2016 and 2026.
Without the mysterious “occupational transfers,” the numbers actually suggest overproduction, a glut of STEM students at the K-12 level (more students taking the AP Calculus Exams and scoring 3 or higher than future openings).
The US Census found using the most common definition of STEM jobs, total STEM employment in 2012 was 5.3 million workers (immigrant and native), but there are 12.1 million STEM degree holders (immigrant and native). There are many more STEM degree holders than students who took AP Calculus and scored at least 3 on the exam! A majority of STEM degree holders do not work in STEM professions.
What should one make of this? The BLS seems to be assuming an extremely high turnover rate in STEM workers, with at least fifty percent dropping out or being pushed out in only ten years. This assumption may then be used to argue for a shortage! The shortage would be due entirely to the mysterious unexplained “occupational transfers.”
Conclusion
K-12 schools in the United States produce large numbers of highly qualified STEM students who routinely take and pass the AP Calculus exam, either the AB exam or the more advanced BC exam. Remarkably these top students alone are nearly able to fill all existing STEM jobs, not including guest workers on H1-B or other guest worker visas and not including many late bloomers who first take calculus in college or even graduate school.
NOTE: The raw data on numbers of students taking and passing the AP Calculus exams each year from 2002 to 2016 are in the Comma Separated Values (CSV) format and the Python model fitting script used in the analysis above are given in the Appendix below. The data follows the Python script.
(C) 2018 by John F. McGowan, Ph.D.
About Me
John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).
Appendix
estimate_k12_stem.py
#
# estimate total production of STEM students at the
# K-12 level (pre-college)
#
# data and model for number of students who pass the
# AP Calculus exams (both AB and BC) from the College
# Board each year (data from 2002 to 2016)
#
# estimate total production of STEM students by K-12
# education from 1970 to 2016 (about 5 million estimated)
#
# versus about 12.1 million STEM degree holders in 2014
# and 5.3 million actual employed STEM workers in 2014
#
# Source: https://cis.org/There-STEM-Worker-Shortage
#
# (C) 2018 by John F. McGowan, Ph.D. (ceo@mathematical-software.com)
#
#
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
from mpl_toolkits.mplot3d import Axes3D
import scipy.optimize as opt
import pandas as pd
df = pd.read_csv(“AP Calculus Totals by Year.csv”)
df.dropna()
ab_str = df.values[:,2]
bc_str = df.values[:,4]
ab = np.zeros(ab_str.shape)
bc = np.zeros(bc_str.shape)
index = 0
for val in ab_str.ravel():
if isinstance(val, str):
ab[index] = np.float(val.replace(‘,’,”))
index += 1
index = 0
for val in bc_str.ravel():
if isinstance(val, str):
bc[index] = np.float(val.replace(‘,’, ”))
index += 1
temp = ab + bc
total = temp[:-2]
years_str = df.values[0:15,0]
years = np.zeros(years_str.shape)
for index in range(years.size):
years[index] = np.float(years_str[index])
# to match BLS projections
future_years = np.linspace(2016, 2026, 11)
assert future_years.size == 11
future_students = my_exp(future_years, *popt)
# https://fas.org/sgp/crs/misc/R43061.pdf
#
# The U.S. Science and Engineering Workforce: Recent, Current,
# and Projected Employment, Wages, and Unemployment
#
# by John F. Sargent Jr.
# Specialist in Science and Technology Policy
# November 2, 2017
#
# Congressional Research Service 7-5700 www.crs.gov R43061
#
# “In 2016, there were 6.9 million scientists and engineers (as
# defined in this report) employed in the United States, accounting
# for 4.9 % of total U.S. employment.”
#
# BLS astonishing/bizarre projections for 2016-2026
# “The Bureau of Labor Statistics (BLS) projects that the number of S&E
# jobs will grow by 853,600 between 2016 and 2026 , a growth rate
# (1.1 % CAGR) that is somewhat faster than that of the overall
# workforce ( 0.7 %). In addition, BLS projects that 5.179 million
# scientists and engineers will be needed due to labor force exits and
# occupational transfers (referred to collectively as occupational
# separations ). BLS projects the total number of openings in S&E due to growth ,
# labor force exits, and occupational transfers between 2016 and 2026 to be
# 6.033 million, including 3.477 million in the computer occupations and
# 1.265 million in the engineering occupations.”
# NOTE: This appears to project 5.170/6.9 or 75 percent!!!! of current STEM
# labor force LEAVE THE STEM PROFESSIONS by 2026!!!!
# “{:,}”.format(value) to specify the comma separated thousands format
#
print(“TOTAL Scientists and Engineers 2016:”, 6.9e6)
print(“ESTIMATED TOTAL IN 2016 SINCE “, START_YEAR, \
“{:,}”.format(cum_total))
print(“TOTAL FROM 2002 to 2016 (College Board Data) “, \
“{:,}”.format(total.sum()))
print(“ESTIMATED FUTURE STUDENTS (2016 to 2026):”, \
“{:,}”.format(future_students.sum()))
John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).
An article in the left-wing Mother Jones magazine on Indian students and the OPT program, using students at the University of Central Missouri as examples.
An article in Business Insider on the possible high turnover rate of many tech companies. It does not clearly separate the turnover rate and average duration of employment at a company. A company that is growing rapidly can have a low turnover rate and a low average duration of employment simply because so many employees are new. If a company doubles in size in two years, half its’ employees will have no more than two years of employment at the company.
Apple, for example, has been growing and hiring rapidly the last several years. Many employees are new which will pull down the average employment time. Having worked at Apple from 2014-2016, I suspect it does have a high turnover rate but it is hard to prove due to the apparent rapid growth of the company.
An article in The Guardian questioning Google and other Silicon Valley employer explanations for the low numbers of some groups in their companies, pointing to the large number and percentage of African Americans employees in software engineering in the Washington DC area — generally at government agencies such as NASA and government contractors.
It should be noted that the DC metro area is about 25 percent African-American whereas California as a whole is about 6.5 percent African-American. Of course, as the article points out, Google and many other tech companies recruit worldwide.
However, Hispanics with visible American Indian ancestry almost certainly make up over 30 percent of California and the San Francisco Bay Area’s population, a comparable or even larger fraction than African-Americans in the DC metro area. The US Census claims that 38.9 percent of people in California in 2016 were Hispanic-Latino. Probably 80 to 90 percent of these have visible American Indian ancestry.
The US Census relies on self-identification for race rather than visible appearance. Hispanics self-identify as white, mixed race, “other race,” and sometimes American Indian/Native American. My personal impression is that genuine discrimination tends to follow visible appearance and accent/spoken dialect of English.
Hispanic is not a racial category, including people who are entirely European and indeed Northern European in appearance. At least in my personal experience, most — not all — Hispanics in leadership and engineering positions at high tech companies like Google are European in appearance. On its diversity web site, Google claims that 4 percent of its workforce in 2017 are Hispanic.
The article discusses an internal Google spreadsheet set up by a now former Google employee with self-reported salary and bonus information from Google employees showing women paid less than men. There is also discussion of the current Labor Department investigation into disparities in salaries between men and women at Google as well as activist investors pressuring Google to disclose information on the salaries of men and women at Google.
An article noting the obvious inconsistency between the many layoff announcements in high tech and the claims of a shortage of STEM workers, often by the same employers.
An article claiming discontent over the new open office plans at Apple’s new headquarters — the “Spaceship” — in Cupertino.
(C) 2017 John F. McGowan, Ph.D.
About the author
John F. McGowan, Ph.D. solves problems using mathematics and mathematical software, including developing gesture recognition for touch devices, video compression and speech recognition technologies. He has extensive experience developing software in C, C++, MATLAB, Python, Visual Basic and many other programming languages. He has been a Visiting Scholar at HP Labs developing computer vision algorithms and software for mobile devices. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech).