FY 14-15: Agency Priority Goal
Increase the Nation’s Data Science Capacity
Priority Goal
Goal Overview
Innovative information technologies are transforming the fabric of society, and data represent a transformative new currency for science, education, government, and commerce. Data are everywhere; they are produced in rapidly increasing volume and variety by virtually all scientific, educational, governmental, societal and commercial enterprises. (For more information see “Dealing with Data,” Science Magazine, Volume 331, February 11, 2011.)
Today we live in an era of data and information. This era is enabled by modern experimental methods and observational studies; large-scale simulations; scientific instruments, such as telescopes and particle accelerators; Internet transactions, email, videos, images, and click streams; and the widespread deployment of sensors everywhere – in the environment, in our critical infrastructure, such as in bridges and smart grids, in our homes, and even on our clothing. Every day, 2.5 quintillion bytes of data are generated – so much that 90 percent of the data in the world today has been created in the last two years alone (http://www-01.ibm.com/software/data/bigdata/).
It is important to note that when we talk about big data it is not just the enormous volume of data that needs to be emphasized, but also the heterogeneity, velocity, and complexity that collectively create the science and engineering challenges we face today.
In December 2010, the President’s Council of Advisors on Science and Technology (PCAST) published a report to the President and Congress entitled: Designing a Digital Future: Federally Funded Research and Development in Networking and Information Technology. (http://www.whitehouse.gov/sites/default/files/microsites/ostp/pcast-nitrd-report-2010.pdf) In that report, PCAST pointed to the research challenges involved in large-scale data management and analysis and the critical role of Networking and Information Technology (NIT) in moving from data to knowledge to action, underpinning the Nation’s future prosperity, health and security.
Through long-term, sustained investments in foundational computing, communications and computational research, and the development and deployment of large-scale facilities and cyberinfrastructure, federal agency R&D investments over the past several decades have both helped generate this explosion of data and advanced our ability to capture, store, analyze, and use these data for societal benefit. More specifically, we have seen fundamental advances in machine learning, knowledge representation, natural language processing, information retrieval and integration, network analytics, computer vision, and data visualization, which together have enabled Big Data applications and systems that have the potential to transform all aspects of our lives.
These investments are already starting to pay off, demonstrating the power of Big Data approaches across science, engineering, medicine, commerce, education, and national security, and laying the foundations for U.S. competitiveness for many decades to come. But much more needs to be done, particularly in four areas: 1) basic research; 2) data infrastructure; 3) education and workforce development; and 4) community outreach.
NSF can catalyze progress in these areas by developing programs to engage the research community, and by creating mechanisms to catalyze the development of people and infrastructure to address the challenges posed by this new flood of data.
NSF will help increase the number of data scientists engaged in academic research, development, and implementation. The 2005 NSB publication Long-lived Digital Data Collections: Enabling Research and Education in the 21st Century defines data scientists as “the information and computer scientists, database and software programmers, disciplinary experts, curators, and expert annotators, librarians, archivists and others, who are crucial to the successful management of a digital data collection.”
Using its ability to convene diverse sets of stakeholders, NSF will promote multi-stakeholder partnerships by supporting workshops and follow-on activities that bring together representatives of industry, academia, not-for-profit organizations, and other entities to address current and future big-data challenges. NSF will also leverage existing programs, such as the NSF Research Traineeship (NRT) and the Graduate Research Fellowship (GRF) programs, and create new programs and add tracks to current programs, as needed, to support the creation of more researchers and students competent in the deep analytical and technical skills required to address those challenges.
NSF will develop strategies to build and sustain data infrastructure for the 21st century through CIF21.
NSF will coordinate with other agencies through the National Science and Technology Council to achieve this goal.
Strategies
NSF's strategy to increase the Nation's data science capacity has three parallel processes: Human Capital Development, Partnerships, and Infrastructure.
Parallel Process 01: Human Capital Development, "implement mechanisms to support the training and workforce development of future data scientists"
NSF will address the issues of big-data workforce development internally by modifying current emphasis areas or by adding big-data tracks to existing programs. In particular, NSF will use one or more of the following programs for students or recent PhD graduates to gain experience through data-intensive science and engineering projects:
- Innovative Technology Experiences for Students and Teachers (ITEST);
- Improving Undergraduate STEM Education;
- NSF’s Research Traineeship (NRT) program;
- The Graduate Research Fellowship (GRF) program;
- Recruit AAAS Fellows in the Data Science track;
- Critical Techniques and Technologies for Advancing Big Data Science & Engineering (BIGDATA);
- Data Infrastructure Building Blocks (DIBBs);
- EarthCube, Building Collaborative Communities, and other community building activities for data-intensive projects/programs; and
- CDS&E activities that can be leveraged to create opportunities for students and faculty to develop the skills and expertise needed to engage in data science.
As new programs and activities come on line in FY 2014 and FY 2015, NSF will look for opportunities to incorporate training and preparation of data scientists at all stages of a researcher’s career.
NSF will fund a conference or workshop for graduate students who are studying data science, from across IGERT/NRT, SLCs, GRF, and other programs in FY 2014.
A monitoring contract will be planned to gather data on the career pathways of students entering NSF funded programs to study data science.
In FY 2014, a mechanism will be developed for tracking applications to the GRFP and NRT programs that indicate research interest in data science. NSF will use this mechanism to monitor changes in student interest in data science fields in FY 2015.
Indicators in support of the first process:
The acceptance of “data scientist” as a professional category in academia, industry, and government:
- Baseline: establish verifiable baselines for undergraduate, certificate, and graduate programs by September 30, 2014.
- Target: 25% increase in the number of degree, concentration, and certificate programs in data science in U.S. universities by 2015.
The number of NSF solicitations that include emphasis on the preparation of data scientists:
- Inventory NSF solicitations that could appropriately include an emphasis on the preparation of data scientists by June 30, 2014.
- Introduce language emphasizing interest in preparing data scientists in 75% of solicitations that could appropriately do so by September 30, 2015.
Parallel Process 10: Partnerships, "increase the number of multi-stakeholder partnerships to address the nation’s big-data challenges"
Internally, NSF will develop strategies and pilot activities within current programs to pull together industry and academic partners to engage in national big data challenges (e.g., I/UCRC, Big Data Hubs for center-scale projects).
Externally, NSF will sponsor workshops and other activities to engage potential stakeholders in building multi-stakeholder partnerships. A workshop planned for FY 2014 is intended to maintain and build on partnerships announced at a major, multi-agency big-data event in the fall. This workshop will inform what specific external activities NSF will support in FY 2014.
NSF plans to host or support two additional partnership-building workshops over the timespan of this Priority Goal (FY 2014-FY 2015) that produce reports identifying emerging data science and big data needs.
Indicator in support of the second process:
The number and/or quality of multi-stakeholder partnerships created to address big-data challenges:
- NSF was the leader in an initiative to develop and implement novel, multi-stakeholder partnerships that promise progress in Big Data. At the White House-sponsored Data to Knowledge to Action event of FY 2014, 90 partners and 30 partnerships were announced. To expand this number of Big Data partnerships in a sustainable manner, NSF plans to fund Big Data Regional Innovation Hubs in FY 2015. These Hubs will work to expand Big Data discovery, education, and innovation through support of multi-stakeholder partnerships that exist at the time of the award and by facilitation of the formation of additional partnerships. NSF will quantify the number of partnerships and partners generated by funded Hubs to the extent possible within the timeframe of the Priority Goal.
Parallel Process 11: Infrastructure, "increase investments in current and future data infrastructure extending data-intensive science into more research communities"
NSF will ensure that the DIBBs and BIGDATA programs are strategically positioned to support the development of new data infrastructure.
Indicator in support of the third process (three is rendered “11” in binary):
The number of communities/organizations/ecosystems that use data infrastructure and tools for their R&D activities:
- In FY 2014, NSF will establish a baseline of NSF-funded infrastructure projects by discipline.
- By September 30, 2015, increase the number of disciplines supported by NSF data infrastructure programs by approximately 12.
Progress Update
Final Update
NSF achieved this Priority Goal by making progress in three areas: human capital development, partnerships, and infrastructure.
In the area of human capital development, NSF sought to implement mechanisms to support the training and workforce development of future data scientists. NSF:
- Incorporated language encouraging data science education and training in 18 solicitations:
- AitF
- BCC
- BIGDATA
- CC*DNI
- CRII
- CPS
- CyberSEES
- EarthCube
- ECR
- EXTREEMS-QED
- GRFP
- HBCU-UP
- NRT
- NRI
- SaTC
- SFS
- STEMC
- XPS
- Funded workshops for the community:
- NAS Workshop: Training Students to Extract Value from Big Data, April 2014.
- Advancing Data-Intensive Research in Education, June 2015.
- Graduate Data Science Workshop, August 2015. (http://depts.washington.edu/dswkshp/)
- Tracked applications to graduate fellowship programs from students interested in data science. Between FY 2014 and FY 2015, the number of applicants to the Graduate Research Fellowship program in data science fields increased by over 60 percent, and the number of awardees increased by over 35 percent.
In the area of partnerships, NSF sought to increase the number of multi-stakeholder partnerships to address the nation’s Big Data challenges. NSF’s major accomplishment was the establishment of four Big Data Innovation Hubs across the Nation. These Hubs will work to expand Big Data discovery, education, and innovation through support of multi-stakeholder partnerships that exist at the time of the award and by facilitation of the formation of additional partnerships.
In the area of infrastructure, NSF sought to increase investments in current and future data infrastructure extending data-intensive science into more research communities.
- The BCC, DIBBs, and BIGDATA programs, which support the development of new data infrastructure, supplied a total of $110.3 million in funding to 86 projects in FY 2014 and FY 2015.
- Compared to FY 2013, FY 2015 usage of the data-intensive resources of the NSF-funded Extreme Science and Engineering Discovery Environment (XSEDE, https://www.xsede.org/) rose by 30 percent (from 519 million to 673 million Service Units, https://portal.xsede.org/knowledge-base/-/kb/document/bazo). The number of scientific disciplines using XSEDE rose by 25 percent (from 28 to 35 disciplines).
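The percentage increases quoted for XSEDE can be checked with a short calculation. The following is a minimal sketch in Python; the helper name `percent_change` is hypothetical and is not part of any NSF or XSEDE tool. It uses only the figures stated in the update above.

```python
def percent_change(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("baseline must be non-zero")
    return (new - old) / old * 100.0


# Figures quoted in the Final Update (FY 2013 baseline vs. FY 2015).
service_units = percent_change(519e6, 673e6)  # XSEDE Service Units
disciplines = percent_change(28, 35)          # disciplines using XSEDE

print(f"Service Units: {service_units:.1f}%")  # prints "Service Units: 29.7%"
print(f"Disciplines: {disciplines:.1f}%")      # prints "Disciplines: 25.0%"
```

The Service Unit figure works out to about 29.7 percent, which the update rounds to 30 percent; the discipline count rises by exactly 25 percent.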
Previous Updates
In the third quarter of FY 2015, NSF made progress in three areas:
Human Capital Development:
Partnerships: The Big Data Innovation Hubs website (https://bdhub.info), launched in Q2 2015, was used by stakeholders to collaborate and draft proposals for the program. The deadline for submission of proposals was June 24, 2015, and four proposals were submitted. Each region had over eighty partners, with the largest number of partners in the Northeast Hub. NSF will fund four Hubs in Q4 2015 to support partnerships that strive to achieve common Big Data goals that would not be possible to achieve alone.
Infrastructure: In an effort to measure the number of disciplines that use data infrastructure and tools for their research and development (R&D) activities, NSF has gathered baseline data for data-intensive HPC resource usage through XSEDE (Extreme Science and Engineering Discovery Environment). NSF will continue to measure HPC usage to identify trends in data-intensive usage among disciplines.
In the second quarter of FY 2015, NSF made progress in three areas:
Human Capital Development:
Partnerships: There are currently 294 registered BDHubs partners. Charrette materials can be found at https://bdhub.info.
Infrastructure:
In the first quarter of FY 2015, NSF made progress in two areas.
Human Capital Development:
Partnerships:
In the fourth quarter of FY 2014, NSF made progress in three areas.
Human Capital Development:
Partnerships:
Infrastructure:
In the third quarter of FY 2014, NSF made progress in three areas.
Human Capital Development:
Partnerships:
Infrastructure:
In the first half of FY 2014, NSF made progress in three areas.
Human Capital Development:
Partnerships:
Infrastructure:
Next Steps
No Data Available
Performance Indicators
Number of undergraduate, certificate, and graduate programs in data science
Contributing Programs & Other Factors
Within NSF, this effort is led by the Directorates for Computer and Information Science and Engineering (CISE) and Education and Human Resources (EHR). All research directorates participate in programs such as NRT and DIBBs.
NSF is one of a number of agencies that participate in Big Data coordination through the National Science and Technology Council's Networking and Information Technology Research and Development (NITRD) Program. The participating agencies are:
- Agency for Healthcare Research and Quality
- Defense Advanced Research Projects Agency
- Department of Homeland Security
- Department of Energy - National Nuclear Security Administration
- Department of Energy - Office of Electricity Delivery and Energy Reliability
- Department of Energy - Office of Science
- Environmental Protection Agency
- Department of Health and Human Services - Office of the National Coordinator for Health Information Technology
- National Archives and Records Administration
- National Aeronautics and Space Administration
- National Institutes of Health
- National Institute of Standards and Technology
- National Oceanic and Atmospheric Administration
- National Reconnaissance Office
- National Security Agency
- National Science Foundation
- DoD Service Research Organizations
Strategic Goals
Strategic Goal:
Stimulate Innovation and Address Societal Needs through Research and Education
Statement:
Stimulate Innovation and Address Societal Needs through Research and Education
Strategic Objectives
Statement:
Strengthen the links between fundamental research and societal needs through investments and partnerships
Description:
The first part of NSF’s mission, as expanded by the first strategic goal, is to create new knowledge and expand the Nation’s intellectual capital. However, NSF's mission does not end there. We also must connect new knowledge to innovations that address societal needs above and beyond the need for advancement in science. This strategic objective is aimed at developing connections between new insights and global challenges (often involving essential interdisciplinary collaborations, prototypes, and technologies). It also entails educating a workforce capable of using and adapting discoveries to meet society’s needs.
One approach to developing these connections is through partnerships involving other government agencies and private and international entities. Such partnerships leverage NSF resources and help ensure that fundamental research outcomes are translated into benefits to society.
Statement:
Build the capacity of the Nation to address societal challenges using a suite of formal, informal, and broadly available STEM educational mechanisms.
Description:
NSF has the opportunity and responsibility to leverage our research and education activities to engage the public and help citizens develop a better understanding of science, one that can inform opinions about issues faced in daily living, in participation in the democratic process, and in helping to advance science. Formal education through the Nation’s K-12 schools provides the foundation for citizens’ understanding of STEM and its uses in addressing the needs of society. This learning continues for those who further their education in the Nation’s colleges and universities. Informal education is another powerful means to provide learning and instill interest in STEM topics in everyone throughout their lives. Technology holds promise for new pathways to learning, including personalized learning. By investing in research and development on STEM education and learning, NSF extends the reach of our programs to the public.
Agency Priority Goals
Statement:
Improve the Nation’s capacity in data science by investing in the development of human capital and infrastructure.
By September 30th, 2015, implement mechanisms to support the training and workforce development of future data scientists; increase the number of multi-stakeholder partnerships to address the nation’s big-data challenges; and increase investments in current and future data infrastructure extending data-intensive science into more research communities.
Strategic Objectives
Strategic Objective:
Statement:
Strengthen the links between fundamental research and societal needs through investments and partnerships
Agency Priority Goals
Statement: Improve the Nation’s capacity in data science by investing in the development of human capital and infrastructure. By September 30th, 2015, implement mechanisms to support the training and workforce development of future data scientists; increase the number of multi-stakeholder partnerships to address the nation’s big-data challenges; and increase investments in current and future data infrastructure extending data-intensive science into more research communities.
Strategic Objective:
Statement:
Integrate education and research to support development of a diverse STEM workforce with cutting-edge capabilities
Description:
The global competitiveness of the United States in the 21st century depends directly on the readiness of the Nation’s STEM workforce. Educational institutions around the country must recruit, train, and prepare a diverse STEM workforce to advance the frontiers of science and participate in the U.S. technology-based economy. One of NSF’s most enduring contributions to the national innovation ecosystem is the integration of education and research in the activities we support. When students participate in cutting-edge research activities under the guidance of the Nation’s most creative scientists and engineers, the students can gain the up-to-date knowledge and practical, hands-on experience needed to develop into creative contributors who can engage in innovative activities throughout all sectors of society. The successive cadres of high-tech workers, each armed with practical knowledge of the most advanced thinking and technology of the day, create the flow of highly adaptable human capital needed to power discovery and innovation. NSF also supports the development of a strong STEM workforce by investing in building the knowledge that informs improvements in STEM teaching and learning. Such improvements include effective curricular and teaching strategies for increased student learning, as well as new approaches enabled by advanced classroom technologies. Investments in social science and education research in learning, teaching, and institutions can have major impacts when derived insights are applied to the education of the STEM workforce.
The transformation of the frontiers of science and engineering requires dramatic change in the diversity of S&E communities. The demographic evolution in the United States is reflected in a strong, growing workforce whose makeup is changing rapidly. Women and members of minority groups represent an expanding portion of the country’s potential intellectual capital. NSF is committed to increasing access for currently underrepresented groups to STEM education and careers through our investments in research and education. The resulting enhancement of diversity is essential to provide the strength that comes from diverse perspectives, as well as to assure development of the Nation’s intellectual capital.
Agency Priority Goals
Statement: Improve STEM graduate student preparedness for entering the workforce. By September 30, 2017, NSF will fund at least three summer institutes and 75 supplements to existing awards to provide STEM doctoral students with opportunities to expand their knowledge and skills to prepare for a range of careers.
Description: A strong global economy is reliant on the ability to capitalize on technical innovations that result from a skilled and agile STEM workforce. As a result, the Nation’s scientific workforce must evolve and mature to include more doctoral level researchers in positions outside of academia. These positions require comprehensive preparation in science at the graduate level, as well as proficiency in other critical skills. Surveys of graduate students analyzed in recent reports have demonstrated that graduate student training has not kept pace with STEM workforce needs beyond traditional roles in academia. In recent years there has been a shift in the job market for science and engineering doctorate holders that has resulted in more varied career choices. Scientists and engineers with doctorates are now more evenly split between the business sector (45%) and the education sector (46%) (Source: Survey of Earned Doctorates, National Science Foundation, National Center for Science and Engineering Statistics 2013). Within the education sector, over 90 percent of doctorate holders are employed at 4-year institutions. However, Ph.D. training remains largely focused on preparation for the research component of academic careers with an emphasis on skills needed at research institutions. There is considerable value to traditional academic training, which can provide doctoral graduates with experience in critical thinking as well as oral and written communication. The purpose of this Priority Goal is to provide opportunities for science and engineering doctoral students so they can acquire the knowledge, experience, and skills needed for highly productive careers, inside and outside of academe.
Although investments in the Graduate Research Internship Program (GRIP) and Graduate Research Opportunities Worldwide (GROW) provide support across disciplines that help address this issue, a larger, agency-wide effort directed at the specific goal of determining effective approaches to increased graduate student preparedness is needed. The activities in this APG will be undertaken in coordination with NSF’s forthcoming strategic framework for investment in graduate students and graduate education. In addition, these approaches will be reported at the National Science and Technology Council (NSTC) Committee on STEM Education’s Interagency Working Group on Graduate Education (co-chaired by NSF and NIH) as a possible model for consideration by other agencies. The NSTC Committee on STEM Education notes that “tomorrow’s STEM workforce will need to include effective change makers and entrepreneurs in business, public service, civil society, and academia. Some universities are encouraging students to set and meet more ambitious goals for their research, education, and service; giving students greater autonomy earlier in their career; connecting students to real-world problems at a regional, national, and global level; and involving students in the design of university curricula, research initiatives, and collaborations with external partners.” This APG will explore ways to partner with universities to identify and spread promising practices for achieving this vision.
Statement: Build the capacity of the Nation to solve research challenges and improve learning by investing strategically in crowdsourcing and other forms of public participation in science, technology, engineering, and mathematics research (PPSR). By September 30, 2017, NSF will implement mechanisms to expand and deepen the engagement of the public in research.
Description: Problem/Opportunity: Scientists, mathematicians, and engineers have involved the public in their research efforts to solve challenging problems for centuries in a variety of fields. For example, daily precipitation data collected by volunteers throughout the US have been used to develop more accurate, fine-grained models that improve weather forecasting, agriculture, and disaster risk analyses. Water quality and wildlife monitoring projects allow communities to understand their local environments in systematic ways and allow them to compare their findings with those from other areas. These types of activities have been referred to in a variety of ways. For this Agency Priority Goal, "Public Participation in Science, Technology, Engineering, and Mathematics Research" (PPSR) is used as an overarching term that includes citizen science, crowdsourcing research, and similar activities. PPSR has grown significantly in the past decade, in part due to new technological tools that facilitate interactions between scientists and participants. A number of economic, societal, and technological trends are increasing the variety and value of what public participation in research can accomplish. These trends include: the democratization of the tools needed to design and make a variety of items; the Maker Movement; the emergence of online communities with shared interests in projects such as exploration of diverse fields of science, technology, engineering, and mathematics (STEM) by members of the public; and crowdfunding platforms that allow teams to raise funding for their projects. New technological tools have also facilitated crowdsourcing research, a process in which open calls are made for voluntary contributions to STEM problem-solving. These calls are typically either to a non-specified group of individuals ("the crowd") or to individuals with specific expertise, thus leveraging the skills and knowledge of many.
Without public participants and their contributions, some STEM research that addresses challenging problems would not be practical or even possible, e.g., projects mandating data collection from many geographical locations or over long periods of time or projects that require expertise for analysis of data as well as large sets of visual or numeric data. PPSR approaches hold promise to continue to address new research questions and contribute to ongoing STEM research. Moreover, citizen science and crowdsourcing research provide opportunities for the broadest possible participation in learning how STEM research is done and in engaging in it directly. Participants include individuals from urban, suburban and rural communities; diverse economic, geographic, racial, ethnic, gender, and linguistic groups; and individuals with a range of abilities and disabilities. The motivation for PPSR may be derived from community concerns or may be researcher-led. The level of public involvement varies from being contributory (e.g., collecting and recording data) to collaborative (e.g., analyzing samples and discussing results) to co-created (in which the public might be involved in all phases of the scientific process from defining the question for investigation, to experimenting, analyzing, and reporting). Thus, people with various interests and abilities are often able to participate and contribute productively. With the opportunity to reach more people and therefore collect and analyze data sets more extensively than possible through the efforts of scientists alone, PPSR may go beyond simply enhancing our ability to do traditional STEM research better. Citizen science and crowdsourcing science enable us to pursue entirely new avenues of research and development that can only be achieved through public-scientist collaborations. 
The different perspectives and habits of mind that public participants can bring to bear on the interpretation of data may also open new avenues of research and development. Over the past decade, NSF has funded hundreds of STEM research projects that rely on PPSR across a diverse array of fields. The scope of PPSR is broad and encompasses geosciences and biological sciences, technology and engineering, social and behavioral sciences, education, computer and information sciences, and physical sciences. These projects collectively have created a strong foundation for future PPSR activities and have identified areas for potential improvement and expansion. The next phase of NSF investments will expand beyond project-by-project approaches to explore underlying issues and areas for innovation. In particular, this next phase could help identify: new research challenges that might be addressed using PPSR; new PPSR-enabling technology; social aspects of working with the public; effective PPSR program design; learning experience facilitated by PPSR; ways in which PPSR can broaden participation in STEM; and a myriad of data-related issues, including data quality and collection, data management, visualization, and data ownership models. This phase of investments should also prompt the broader community to tackle long-standing but unresolved STEM challenges and to open doors to new STEM research areas. To achieve this Agency Priority Goal NSF will use three specific mechanisms to fund proposals that explicitly include PPSR approaches: Research Coordination Networks (RCNs), EArly-concept Grants for Exploratory Research (EAGERs), and supplements to existing awards. Research Coordination Networks support communication and coordination across disciplinary, organizational, institutional, and geographic boundaries, thus facilitating ongoing activities above the project level. EAGERs are designed as "high risk-high payoff" awards. 
These types of awards will likely advance our collective understanding of how PPSR is leveraged to support scientific discovery and the public's engagement with research. Supplements to existing awards provide opportunities to (1) include PPSR approaches in projects that are appropriate for PPSR but have not yet incorporated them and (2) deepen the use of PPSR approaches in other projects. This Agency Priority Goal also takes advantage of the Executive Branch's momentum in this area. For example, the White House honored Citizen Science Champions of Change and included citizen science projects and opportunities in its recent science fair. The Office of Science and Technology Policy (OSTP) rolled out a new toolkit for federally sponsored PPSR projects on September 30, 2015 and issued a memo with actions for federal agencies with respect to PPSR. Among the public communities that NSF serves, this Agency Priority Goal is relevant and timely. It addresses the need for investments in PPSR as articulated in recent journals, such as Science; at conferences, such as the citizen science pre-conference workshop at AAAS in 2015; and by practitioner organizations, such as the Citizen Science Association.
Relationship to agency strategic goals and objectives:
PPSR projects have both scientific value and educational value. Thus, PPSR supports NSF Strategic Goal 1, Objective 1 ("Invest in fundamental research to ensure significant continuing advances across science, engineering, and education") and NSF Strategic Goal 2, Objective 2 ("Build the capacity of the Nation to address societal challenges using a suite of formal, informal and broadly available STEM educational mechanisms").
Key barriers and challenges to its achievement:
1. Coordinate cross-program and cross-directorate investments that enhance both an understanding of and ability to implement PPSR approaches.
2. Manage expectations among colleagues across the federal government and public sphere as PPSR is further developed to support their daily work.
External factors:
OSTP and the Federal Community of Practice for Citizen Science and Crowdsourcing (FCPCSC) have directly contributed to development of this Agency Priority Goal. In addition, activities by federal agencies and offices related to open innovation, citizen science, and crowdsourcing research will inform the state of the field with respect to challenges and opportunities in PPSR.
Statement: Improve the Nation’s capacity in data science by investing in the development of human capital and infrastructure. By September 30, 2015, implement mechanisms to support the training and workforce development of future data scientists; increase the number of multi-stakeholder partnerships to address the Nation’s big-data challenges; and increase investments in current and future data infrastructure, extending data-intensive science into more research communities.
Description: Innovative information technologies are transforming the fabric of society, and data represent a transformative new currency for science, education, government, and commerce. Data are everywhere; they are produced in rapidly increasing volume and variety by virtually all scientific, educational, governmental, societal and commercial enterprises. (For more information see “Dealing with Data,” Science Magazine, Volume 331, February 11, 2011.) Today we live in an era of data and information. This era is enabled by modern experimental methods and observational studies; large-scale simulations; scientific instruments, such as telescopes and particle accelerators; Internet transactions, email, videos, images, and click streams; and the widespread deployment of sensors everywhere – in the environment, in our critical infrastructure, such as in bridges and smart grids, in our homes, and even on our clothing. Every day, 2.5 quintillion bytes of data are generated – so much that 90 percent of the data in the world today has been created in the last two years alone (http://www-01.ibm.com/software/data/bigdata/). It is important to note that when we talk about big data it is not just the enormous volume of data that needs to be emphasized, but also the heterogeneity, velocity, and complexity that collectively create the science and engineering challenges we face today. In December 2010, the President’s Council of Advisors on Science and Technology (PCAST) published a report to the President and Congress entitled: Designing a Digital Future: Federally Funded Research and Development in Networking and Information Technology. 
(http://www.whitehouse.gov/sites/default/files/microsites/ostp/pcast-nitrd-report-2010.pdf) In that report, PCAST pointed to the research challenges involved in large-scale data management and analysis and the critical role of Networking and Information Technology (NIT) in moving from data to knowledge to action, underpinning the Nation’s future prosperity, health and security. Through long-term, sustained investments in foundational computing, communications and computational research, and the development and deployment of large-scale facilities and cyberinfrastructure, federal agency R&D investments over the past several decades have helped both to generate this explosion of data and to advance our ability to capture, store, analyze, and use these data for societal benefit. More specifically, we have seen fundamental advances in machine learning, knowledge representation, natural language processing, information retrieval and integration, network analytics, computer vision, and data visualization, which together have enabled Big Data applications and systems that have the potential to transform all aspects of our lives. These investments are already starting to pay off, demonstrating the power of Big Data approaches across science, engineering, medicine, commerce, education, and national security, and laying the foundations for U.S. competitiveness for many decades to come. But much more needs to be done, particularly in four areas: 1) basic research; 2) data infrastructure; 3) education and workforce development; and 4) community outreach. NSF can catalyze progress in these areas by developing programs to engage the research community, and by creating mechanisms that foster the development of people and infrastructure to address the challenges posed by this new flood of data. NSF will help increase the number of data scientists engaged in academic research, development, and implementation.
The 2005 NSB publication Long-lived Digital Data Collections: Enabling Research and Education in the 21st Century defines data scientists as “the information and computer scientists, database and software programmers, disciplinary experts, curators, and expert annotators, librarians, archivists and others, who are crucial to the successful management of a digital data collection.” Using its ability to convene diverse sets of stakeholders, NSF will promote multi-stakeholder partnerships by supporting workshops and follow-on activities that bring together representatives of industry, academia, not-for-profit organizations, and other entities to address current and future big-data challenges. NSF will also leverage existing programs, such as the NSF Research Traineeship (NRT) and the Graduate Research Fellowship (GRF) programs, and create new programs and new tracks within current programs, as needed, to support the creation of more researchers and students competent in the deep analytical and technical skills required to address those challenges. NSF will develop strategies to build and sustain data infrastructure for the 21st century through CIF21. NSF will coordinate with other agencies through the National Science and Technology Council to achieve this goal.